Scripts

This section contains a collection of useful scripts for quality control during the training of models.

visualize_transforms

visualize_transforms.run_visualization(fname_config, n_slices, folder_output, fname_roi)

Utility function to visualize Data Augmentation transformations.

Data augmentation is a key part of the Deep Learning training scheme. This script facilitates the fine-tuning of data augmentation parameters by providing a step-by-step visualization of the transformations applied to the data.

This function applies a series of transformations (defined in a configuration file -c) to -n 2D slices randomly extracted from an input image (-i), and saves the resulting sample as a PNG after each transform.

For example:

python visualize_transforms.py -i t2s.nii.gz -n 1 -c config.json -r t2s_seg.nii.gz

Provides a visualization of a series of three transformations on a randomly selected slice:

_images/transforms_im.png

And on a binary mask:

python visualize_transforms.py -i t2s_gmseg.nii.gz -n 1 -c config.json -r t2s_seg.nii.gz

Gives:

_images/transforms_gt.png
Parameters:
  • fname_input (string) – Image filename.
  • fname_config (string) – Configuration file filename.
  • n_slices (int) – Number of slices randomly extracted.
  • folder_output (string) – Folder path where the results are saved.
  • fname_roi (string) – Filename of the region of interest. Only needed if ROICrop is part of the transformations.
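
The step-by-step behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the script's implementation: the transforms `flip_horizontal` and `scale_intensity` and the helper `apply_step_by_step` are hypothetical names, and a nested list stands in for a 2D slice.

```python
def flip_horizontal(slice_2d):
    """Hypothetical transform: mirror each row of the slice."""
    return [row[::-1] for row in slice_2d]


def scale_intensity(slice_2d, factor=2):
    """Hypothetical transform: multiply every pixel by a constant factor."""
    return [[pixel * factor for pixel in row] for row in slice_2d]


def apply_step_by_step(slice_2d, transforms):
    """Apply transforms in order and keep the sample obtained after each one.

    Returns a list of (transform_name, intermediate_slice) pairs, i.e. the
    information a visualization script would save as one image per step.
    """
    snapshots = []
    current = slice_2d
    for transform in transforms:
        current = transform(current)
        snapshots.append((transform.__name__, current))
    return snapshots


# Toy 2x2 "slice" used purely for illustration.
slice_2d = [[1, 2], [3, 4]]
snapshots = apply_step_by_step(slice_2d, [flip_horizontal, scale_intensity])
for name, result in snapshots:
    print(name, result)
```

Keeping every intermediate result, rather than only the final one, is what makes it possible to see which transform is responsible for an unexpected output.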

convert_to_onnx

convert_to_onnx.convert_pytorch_to_onnx(dimension, gpu=0)

Convert PyTorch model to ONNX.

Integrating Deep Learning models into the clinical routine requires CPU-optimized models. Exporting the PyTorch models to ONNX format and running inference with ONNX Runtime is a time- and memory-efficient way to meet this need.

This function converts a model from PyTorch to ONNX format, given whether it is a 2D or 3D model (-d).

Parameters:
  • fname_model (string) – Model filename.
  • dimension (int) – Indicates whether the model is 2D or 3D. Choice between 2 or 3.
  • gpu (string) – GPU ID, if available

automate_training

automate_training.automate_training(fname_param, fixed_split, all_combinations, n_iterations=1, run_test=False)

Automate multiple training processes on multiple GPUs.

Hyperparameter optimization of models is tedious and time-consuming. This function automates this optimization across multiple GPUs. It runs trainings on the same training and validation datasets, combining a given set of parameters with a set of candidate values for each parameter. Results are collected for each combination and reported in a dataframe so that they can be compared. The script allocates each training to one of the available GPUs.

# TODO: add example of DF

Parameters:
  • fname_config (string) – Configuration filename, which is used as skeleton to configure the training. Some of its parameters (defined in fname_param file) are modified across experiments.
  • fname_param (string) –

    JSON file containing the parameter configurations to compare. Parameter “keys” of this file need to match the parameter “keys” of the fname_config file. Parameter “values” are given as a list. Example:

    "default_model": {"depth": [2, 3, 4]}
    
  • fixed_split (bool) – If True, all the experiments are run on the same training/validation/testing subdatasets.
  • all_combinations (bool) – If True, all parameter combinations are run.
  • n_iterations (int) – Number of times each experiment (i.e., set of parameters) is run.
  • run_test (bool) – If True, the trained model is also run on the testing subdataset.
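
As an illustration of what running all parameter combinations involves, the sketch below enumerates the Cartesian product of the values listed in a parameter file. It is an assumption-laden example, not the script's implementation: the helper `enumerate_combinations` and the `training_parameters`/`batch_size` entry are hypothetical, and only the `default_model`/`depth` entry comes from the example above.

```python
import itertools

# Hypothetical content of a fname_param file: each parameter maps to a list
# of candidate values. Only "default_model"/"depth" appears in the example
# above; "training_parameters"/"batch_size" is made up for illustration.
param_grid = {
    "default_model": {"depth": [2, 3, 4]},
    "training_parameters": {"batch_size": [8, 16]},
}


def enumerate_combinations(param_grid):
    """Return one dict of overrides per combination of the listed values."""
    # Flatten the nested dict into parallel lists of keys and value lists.
    keys = []
    value_lists = []
    for category, params in param_grid.items():
        for name, values in params.items():
            keys.append((category, name))
            value_lists.append(values)

    combinations = []
    for chosen in itertools.product(*value_lists):
        overrides = {}
        for (category, name), value in zip(keys, chosen):
            overrides.setdefault(category, {})[name] = value
        combinations.append(overrides)
    return combinations


combinations = enumerate_combinations(param_grid)
print(len(combinations))  # 3 depths x 2 batch sizes
for overrides in combinations:
    print(overrides)
```

Each resulting dict could then be merged into a copy of the base configuration to define one experiment. The number of experiments grows multiplicatively with each added parameter, which is why distributing them over several GPUs matters.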

compare_models

compare_models.compute_statistics(n_iterations, run_test=True)

Compares the performance of models at inference time on a common testing dataset using paired t-tests.

It uses a dataframe generated by scripts/automate_training.py with the parameter --run-test (used to run the models on the testing dataset).

# TODO: add example of DF

Parameters:
  • dataframe (pandas.DataFrame) – Dataframe of results generated by automate_training.
  • n_iterations (int) – Number of times each experiment (i.e., set of parameters) was run.
  • run_test (bool) – Indicates whether the comparison is done on performance on the testing subdataset (True) or on the training/validation subdatasets (False).
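
For readers unfamiliar with the paired t-test, the sketch below computes its test statistic from two lists of scores obtained on the same test cases. It is a minimal illustration using only the standard library, not the code used by compare_models: the function name `paired_t_statistic` and the Dice scores are hypothetical.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return the t statistic of a paired t-test on two equally long lists."""
    if len(scores_a) != len(scores_b):
        raise ValueError("Paired samples must have the same length.")
    # The test operates on per-case differences, not on the raw scores.
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    std_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    return mean_diff / (std_diff / math.sqrt(n))


# Hypothetical Dice scores of two models on the same four test subjects.
dice_model_a = [0.80, 0.82, 0.78, 0.85]
dice_model_b = [0.75, 0.80, 0.74, 0.79]

t_value = paired_t_statistic(dice_model_a, dice_model_b)
degrees_of_freedom = len(dice_model_a) - 1
print(t_value, degrees_of_freedom)
```

Turning the statistic into a p-value requires the Student t distribution with n - 1 degrees of freedom; in practice a library routine such as `scipy.stats.ttest_rel` returns both values directly.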

prepare_dataset_vertebral_labeling

prepare_dataset_vertebral_labeling.extract_mid_slice_and_convert_coordinates_to_heatmaps(suffix, aim=-1)

This function takes as input a path to a dataset and generates a set of images: (i) mid-sagittal image and (ii) heatmap of disc labels associated with the mid-sagittal image.

Example:

python scripts/prepare_dataset_vertebral_labeling -p path/to/bids -s _T2w -a 0
Parameters:
  • bids_path (string) – Path to the BIDS dataset from which images will be generated.
  • suffix (string) – Suffix of the images that will be processed (e.g., T2w).
  • aim (int) – If aim is not 0, retrieves only labels with value = aim; otherwise creates a heatmap with all labels.
Returns:

None. Images are saved in the BIDS folder.
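
The conversion of label coordinates to a heatmap, including the effect of `aim`, can be sketched as follows. This is an illustration under explicit assumptions and not the script's implementation: the helper `labels_to_heatmap`, the `(row, col, value)` label format, and the use of a Gaussian kernel with width `sigma` are all hypothetical choices.

```python
import math


def labels_to_heatmap(labels, shape, aim=0, sigma=1.0):
    """Build a 2D heatmap from (row, col, value) point labels.

    If aim is not 0, only labels whose value equals aim are kept;
    otherwise every label contributes. The Gaussian kernel is an assumption
    made for this sketch.
    """
    if aim != 0:
        labels = [label for label in labels if label[2] == aim]

    n_rows, n_cols = shape
    heatmap = [[0.0] * n_cols for _ in range(n_rows)]
    for row_c, col_c, _ in labels:
        for r in range(n_rows):
            for c in range(n_cols):
                dist_sq = (r - row_c) ** 2 + (c - col_c) ** 2
                value = math.exp(-dist_sq / (2 * sigma ** 2))
                # Keep the strongest response where blobs overlap.
                heatmap[r][c] = max(heatmap[r][c], value)
    return heatmap


# Two hypothetical disc labels on a 5x5 mid-sagittal slice.
labels = [(1, 1, 3), (3, 3, 4)]
heatmap_all = labels_to_heatmap(labels, shape=(5, 5), aim=0)
heatmap_one = labels_to_heatmap(labels, shape=(5, 5), aim=3)
print(heatmap_all[3][3], heatmap_one[3][3])
```

With `aim=0` both label positions reach the peak value, whereas with `aim=3` only the label of value 3 produces a peak.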

extract_small_dataset

extract_small_dataset.extract_small_dataset(ofolder, n=10, contrast_list=None, include_derivatives=True, seed=-1)

Extract small BIDS dataset from a larger BIDS dataset.

Example:

python extract_small_dataset.py -i path/to/BIDS/dataset -o path/of/small/BIDS/dataset -n 10 -c T1w,T2w -d 0 -s 1234
Parameters:
  • ifolder (str) – Input BIDS folder.
  • ofolder (str) – Output folder.
  • n (int) – Number of subjects in the output folder.
  • contrast_list (list) – List of image contrasts to include. If set to None, then all available contrasts are included.
  • include_derivatives (bool) – If True, the content of derivatives/labels/ is also copied; otherwise only the raw images are copied.
  • seed (int) – Set np.random.RandomState to ensure reproducibility: the same subjects will be selected if the function is run several times on the same dataset. If set to -1, each function run is independent.
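
The role of `seed` can be illustrated with a short sketch of reproducible subject selection. It is not the function's implementation: the helper `select_subjects` and the subject names are hypothetical, and the standard-library `random.Random` is used here as a stand-in for `np.random.RandomState`.

```python
import random


def select_subjects(subjects, n=10, seed=-1):
    """Pick n subjects; any seed other than -1 makes the selection repeatable."""
    # With seed == -1 the generator is seeded from system entropy, so each
    # call is independent of the previous ones.
    rng = random.Random(seed) if seed != -1 else random.Random()
    n_selected = min(n, len(subjects))
    return sorted(rng.sample(subjects, n_selected))


# Hypothetical subject folders of a BIDS dataset.
subjects = [f"sub-{i:02d}" for i in range(1, 21)]

first = select_subjects(subjects, n=5, seed=1234)
second = select_subjects(subjects, n=5, seed=1234)
print(first)
print(first == second)
```

Creating a dedicated generator per call, instead of seeding the global one, keeps the selection reproducible without affecting other code that draws random numbers.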