Optimize Hyperparameters

For training, you may want to determine which hyperparameter values work best for your data. To do this, ivadomed provides the function ivadomed_automate_training.

Step 1: Download Example Data

To download the dataset (~490MB), run the following command in your terminal:

ivadomed_download_data -d data_example_spinegeneric

Step 2: Copy Configuration File

In ivadomed, training is orchestrated by a configuration file. Examples of configuration files are available in the ivadomed/config/ folder, and their parameters are documented in Configuration File.

In this tutorial we will use the configuration file: ivadomed/config/config.json. Copy this configuration file in your local directory (to avoid modifying the source file):

cp <PATH_TO_IVADOMED>/ivadomed/config/config.json .

Then, open it with a text editor. Below we will discuss some of the key parameters to perform a one-class 2D segmentation training.

Step 3: Create Hyperparameters Config File

The hyperparameter config file should have the same layout as the config file. To select a hyperparameter you would like to vary, just list the different options under the appropriate key.

In our example, we have 3 hyperparameters we would like to vary: batch_size, loss, and depth. In your directory, create a new file called: config_hyper.json. Open this in a text editor and add the following:

{
    "training_parameters": {
        "batch_size": [2, 64],
        "loss": [
            {"name": "DiceLoss"},
        {"name": "FocalLoss", "gamma": 0.2, "alpha": 0.5}
        ]
    },
    "default_model": {"depth": [2, 3, 4]}
}
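Since the hyperparameter file must mirror the layout of the base config, a quick sanity check can catch typos before launching a long run. The snippet below is a sketch (not part of ivadomed) using the values from this tutorial inlined as dictionaries; in practice you would json.load both files.

```python
import json

# Base config values mirroring config.json (inlined here for illustration).
base = {"training_parameters": {"batch_size": 18, "loss": {"name": "DiceLoss"}},
        "default_model": {"depth": 3}}
# Hyperparameter options mirroring config_hyper.json.
hyper = {"training_parameters": {"batch_size": [2, 64],
                                 "loss": [{"name": "DiceLoss"},
                                          {"name": "FocalLoss", "gamma": 0.2, "alpha": 0.5}]},
         "default_model": {"depth": [2, 3, 4]}}

# Every varied key must exist in the base config under the same section.
for section, params in hyper.items():
    for key in params:
        assert key in base[section], f"{section}.{key} not found in base config"
print("layout OK")
```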

Step 4: (Optional) Change the Training Epochs

The default number of training epochs in the config.json file is 100; however, depending on your computer, this could be quite slow (especially if you don’t have any GPUs).

To change the number of epochs, open the config.json file and change the following:

{
    "training_parameters": {
        "training_time": {
            "num_epochs": 1
        }
    }
}

Step 5: Run the Code

Default

If neither --all-combin nor --multi-params is selected, the hyperparameters are varied one at a time: each listed option is merged into the base config on its own, producing one entry per option in the config_list.

Note

The full config_list is not shown here, as it would take up too much space. Each option listed in Step 3 is incorporated into the base config file config.json.

To run this:

ivadomed_automate_training -c config.json -ch config_hyper.json \
-n 1
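The default behaviour can be sketched as follows. This is an illustration of the combination logic, not ivadomed's actual implementation: each hyperparameter option yields one config in which only that key differs from the base config.

```python
import copy

# Base config values (inlined for illustration; ivadomed reads config.json).
base = {"training_parameters": {"batch_size": 18, "loss": {"name": "DiceLoss"}},
        "default_model": {"depth": 3}}
# (section, key, options) triples mirroring config_hyper.json.
hyper = [
    ("training_parameters", "batch_size", [2, 64]),
    ("training_parameters", "loss", [{"name": "DiceLoss"},
                                     {"name": "FocalLoss", "gamma": 0.2, "alpha": 0.5}]),
    ("default_model", "depth", [2, 3, 4]),
]

# One config per option: only that key is changed from the base.
config_list = []
for section, key, options in hyper:
    for value in options:
        cfg = copy.deepcopy(base)
        cfg[section][key] = value
        config_list.append(cfg)

print(len(config_list))  # 2 + 2 + 3 = 7 configs
```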

All Combinations

If the flag --all-combin is selected, the hyperparameter options are combined combinatorially: every possible combination of the listed options (the Cartesian product) produces one config.

To run:

ivadomed_automate_training -c config.json -ch config_hyper.json \
-n 1 --all-combin
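The all-combinations behaviour amounts to a Cartesian product over the option lists. Again, this is a sketch of the logic rather than ivadomed's actual code:

```python
import copy
import itertools

base = {"training_parameters": {"batch_size": 18, "loss": {"name": "DiceLoss"}},
        "default_model": {"depth": 3}}
hyper = [
    ("training_parameters", "batch_size", [2, 64]),
    ("training_parameters", "loss", [{"name": "DiceLoss"},
                                     {"name": "FocalLoss", "gamma": 0.2, "alpha": 0.5}]),
    ("default_model", "depth", [2, 3, 4]),
]

# Cartesian product: every option of every hyperparameter is paired
# with every option of the others.
config_list = []
for combo in itertools.product(*(options for _, _, options in hyper)):
    cfg = copy.deepcopy(base)
    for (section, key, _), value in zip(hyper, combo):
        cfg[section][key] = value
    config_list.append(cfg)

print(len(config_list))  # 2 * 2 * 3 = 12 configs
```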

Multiple Parameters

If the flag --multi-params is selected, the elements from each hyperparameter list are selected sequentially: all the first elements together, then all the second elements, and so on. If the lists have different lengths, say len(list_a) = n and len(list_b) = n + m, where n and m are strictly positive integers, then only the first n elements of each list are used.

To run:

ivadomed_automate_training -c config.json -ch config_hyper.json \
-n 1 --multi-params
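This positional pairing is exactly what Python's zip does: it stops at the shortest list, which matches the "only the first n elements" behaviour described above. A sketch under the same inlined assumptions as before:

```python
import copy

base = {"training_parameters": {"batch_size": 18, "loss": {"name": "DiceLoss"}},
        "default_model": {"depth": 3}}
hyper = [
    ("training_parameters", "batch_size", [2, 64]),          # 2 options
    ("training_parameters", "loss", [{"name": "DiceLoss"},
                                     {"name": "FocalLoss", "gamma": 0.2, "alpha": 0.5}]),  # 2 options
    ("default_model", "depth", [2, 3, 4]),                   # 3 options
]

# zip pairs the i-th element of each list and stops at the shortest list,
# so the third depth option (4) is never used here.
config_list = []
for combo in zip(*(options for _, _, options in hyper)):
    cfg = copy.deepcopy(base)
    for (section, key, _), value in zip(hyper, combo):
        cfg[section][key] = value
    config_list.append(cfg)

print(len(config_list))  # min(2, 2, 3) = 2 configs
```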

Step 6: Results

There will be an output file called detailed_results.csv. This file gives an overview of the results from all the different trials. For a more fine-grained analysis, you can also look at each of the log directories (there is one for each config option).
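In practice you would load detailed_results.csv with pandas and rank the trials. Note that ivadomed reports Dice as a negative loss, so values closest to 0 are best. The snippet below uses a few values copied from the example table rather than reading the file, to keep it self-contained:

```python
import pandas as pd

# A few (path_output, best_validation_dice) pairs from the example table;
# in practice: df = pd.read_csv("detailed_results.csv")
df = pd.DataFrame({
    "path_output": ["spineGeneric-batch_size=2",
                    "spineGeneric-batch_size=64",
                    "spineGeneric-depth=4"],
    "best_validation_dice": [-0.1456, -0.0469, -0.0615],
})

# Negative Dice loss: the largest (closest to 0) value is the best trial.
best = df.sort_values("best_validation_dice", ascending=False).iloc[0]
print(best["path_output"])  # spineGeneric-batch_size=64
```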

An example of the detailed_results.csv:

  path_output training_parameters default_model best_training_dice best_training_loss best_validation_dice best_validation_loss
0 spineGeneric-batch_size=2 {'batch_size': 2, 'loss': {'name': 'DiceLoss'}, 'training_time': {'num_epochs': 1, 'early_stopping_patience': 50, 'early_stopping_epsilon': 0.001}, 'scheduler': {'initial_lr': 0.001, 'lr_scheduler': {'name': 'CosineAnnealingLR', 'base_lr': 1e-05, 'max_lr': 0.01}}, 'balance_samples': {'applied': False, 'type': 'gt'}, 'mixup_alpha': None, 'transfer_learning': {'retrain_model': None, 'retrain_fraction': 1.0, 'reset': True}} {'name': 'Unet', 'dropout_rate': 0.3, 'bn_momentum': 0.9, 'depth': 3, 'is_2d': True} -0.13313321973048692 -0.13313321973048692 -0.14559978920411557 -0.14559978920411557
2 spineGeneric-loss={'name': 'DiceLoss'} {'batch_size': 18, 'loss': {'name': 'DiceLoss'}, 'training_time': {'num_epochs': 1, 'early_stopping_patience': 50, 'early_stopping_epsilon': 0.001}, 'scheduler': {'initial_lr': 0.001, 'lr_scheduler': {'name': 'CosineAnnealingLR', 'base_lr': 1e-05, 'max_lr': 0.01}}, 'balance_samples': {'applied': False, 'type': 'gt'}, 'mixup_alpha': None, 'transfer_learning': {'retrain_model': None, 'retrain_fraction': 1.0, 'reset': True}} {'name': 'Unet', 'dropout_rate': 0.3, 'bn_momentum': 0.9, 'depth': 3, 'is_2d': True} -0.03612175240414217 -0.03612175240414217 -0.07506937285264333 -0.07506937285264333
5 spineGeneric-depth=3 {'batch_size': 18, 'loss': {'name': 'DiceLoss'}, 'training_time': {'num_epochs': 1, 'early_stopping_patience': 50, 'early_stopping_epsilon': 0.001}, 'scheduler': {'initial_lr': 0.001, 'lr_scheduler': {'name': 'CosineAnnealingLR', 'base_lr': 1e-05, 'max_lr': 0.01}}, 'balance_samples': {'applied': False, 'type': 'gt'}, 'mixup_alpha': None, 'transfer_learning': {'retrain_model': None, 'retrain_fraction': 1.0, 'reset': True}} {'name': 'Unet', 'dropout_rate': 0.3, 'bn_momentum': 0.9, 'depth': 3, 'is_2d': True} -0.0344025717349723 -0.0344025717349723 -0.06566549402972062 -0.06566549402972062
6 spineGeneric-depth=4 {'batch_size': 18, 'loss': {'name': 'DiceLoss'}, 'training_time': {'num_epochs': 1, 'early_stopping_patience': 50, 'early_stopping_epsilon': 0.001}, 'scheduler': {'initial_lr': 0.001, 'lr_scheduler': {'name': 'CosineAnnealingLR', 'base_lr': 1e-05, 'max_lr': 0.01}}, 'balance_samples': {'applied': False, 'type': 'gt'}, 'mixup_alpha': None, 'transfer_learning': {'retrain_model': None, 'retrain_fraction': 1.0, 'reset': True}} {'name': 'Unet', 'dropout_rate': 0.3, 'bn_momentum': 0.9, 'depth': 4, 'is_2d': True} -0.02962107036728412 -0.02962107036728412 -0.06145078005890051 -0.06145078005890051
4 spineGeneric-depth=2 {'batch_size': 18, 'loss': {'name': 'DiceLoss'}, 'training_time': {'num_epochs': 1, 'early_stopping_patience': 50, 'early_stopping_epsilon': 0.001}, 'scheduler': {'initial_lr': 0.001, 'lr_scheduler': {'name': 'CosineAnnealingLR', 'base_lr': 1e-05, 'max_lr': 0.01}}, 'balance_samples': {'applied': False, 'type': 'gt'}, 'mixup_alpha': None, 'transfer_learning': {'retrain_model': None, 'retrain_fraction': 1.0, 'reset': True}} {'name': 'Unet', 'dropout_rate': 0.3, 'bn_momentum': 0.9, 'depth': 2, 'is_2d': True} -0.03455661813495681 -0.03455661813495681 -0.06135410505036513 -0.06135410505036513
1 spineGeneric-batch_size=64 {'batch_size': 64, 'loss': {'name': 'DiceLoss'}, 'training_time': {'num_epochs': 1, 'early_stopping_patience': 50, 'early_stopping_epsilon': 0.001}, 'scheduler': {'initial_lr': 0.001, 'lr_scheduler': {'name': 'CosineAnnealingLR', 'base_lr': 1e-05, 'max_lr': 0.01}}, 'balance_samples': {'applied': False, 'type': 'gt'}, 'mixup_alpha': None, 'transfer_learning': {'retrain_model': None, 'retrain_fraction': 1.0, 'reset': True}} {'name': 'Unet', 'dropout_rate': 0.3, 'bn_momentum': 0.9, 'depth': 3, 'is_2d': True} -0.023551362939178942 -0.023551362939178942 -0.04692324437201023 -0.04692324437201023
3 spineGeneric-loss={'name': 'FocalLoss', 'gamma': 0.2, 'alpha': 0.5} {'batch_size': 18, 'loss': {'name': 'FocalLoss', 'gamma': 0.2, 'alpha': 0.5}, 'training_time': {'num_epochs': 1, 'early_stopping_patience': 50, 'early_stopping_epsilon': 0.001}, 'scheduler': {'initial_lr': 0.001, 'lr_scheduler': {'name': 'CosineAnnealingLR', 'base_lr': 1e-05, 'max_lr': 0.01}}, 'balance_samples': {'applied': False, 'type': 'gt'}, 'mixup_alpha': None, 'transfer_learning': {'retrain_model': None, 'retrain_fraction': 1.0, 'reset': True}} {'name': 'Unet', 'dropout_rate': 0.3, 'bn_momentum': 0.9, 'depth': 3, 'is_2d': True} -0.015067462751176208 84990.18334960938 -0.018857145681977272 66983.3583984375