enable deterministic training

Files changed (6)

README.md +12 -9
configs/metadata.json +2 -1
configs/multi_gpu_evaluate.json +0 -1
configs/multi_gpu_train.json +0 -1
configs/train.json +1 -2
docs/README.md +12 -9

README.md CHANGED

@@ -105,7 +105,7 @@ Example `dataset.json` in output folder:
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_in_out.jpeg)
-## Scores
 This model achieves the following F1 score on the validation data provided as part of the dataset:
 - Train F1 score = 0.941
@@ -132,26 +132,29 @@ Confusion Metrics for <b>Training</b> for individual classes are (at epoch 50):
-## Training Performance
 A graph showing the training Loss and F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_loss_v2.png) <br>
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_f1_v2.png) <br>
-## Validation Performance
 A graph showing the validation F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_f1_v2.png) <br>
-## commands example
-Execute training:
 ```
 python -m monai.bundle run --config_file configs/train.json
 ```
-Override the `train` config to execute multi-GPU training:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']"
@@ -160,19 +163,19 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config
 Please note that the distributed training related options depend on the actual running environment, thus you may need to remove `--standalone`, modify `--nnodes` or do some other necessary changes according to the machine you used.
 Please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) for more details.
-Override the `train` config to execute evaluation with the trained model:
 ```
 python -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json']"
 ```
-Override the `train` config and `evaluate` config to execute multi-GPU evaluation:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json','configs/multi_gpu_evaluate.json']"
 ```
-Execute inference:
 ```
 python -m monai.bundle run --config_file configs/inference.json

 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_in_out.jpeg)
+## Performance
 This model achieves the following F1 score on the validation data provided as part of the dataset:
 - Train F1 score = 0.941
+#### Training Performance
 A graph showing the training Loss and F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_loss_v2.png) <br>
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_f1_v2.png) <br>
+#### Validation Performance
 A graph showing the validation F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_f1_v2.png) <br>
+## MONAI Bundle Commands
+In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.
+For more detailed usage instructions, visit the [MONAI Bundle Configuration Page](https://docs.monai.io/en/latest/config_syntax.html).
+#### Execute training:
 ```
 python -m monai.bundle run --config_file configs/train.json
 ```
+#### Override the `train` config to execute multi-GPU training:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']"
 Please note that the distributed training related options depend on the actual running environment, thus you may need to remove `--standalone`, modify `--nnodes` or do some other necessary changes according to the machine you used.
 Please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) for more details.
+#### Override the `train` config to execute evaluation with the trained model:
 ```
 python -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json']"
 ```
+#### Override the `train` config and `evaluate` config to execute multi-GPU evaluation:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json','configs/multi_gpu_evaluate.json']"
 ```
+#### Execute inference:
 ```
 python -m monai.bundle run --config_file configs/inference.json

configs/metadata.json CHANGED

@@ -1,7 +1,8 @@
 {
     "schema": "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json",
-    "version": "0.0.7",
     "changelog": {
         "0.0.7": "update benchmark on A100",
         "0.0.6": "adapt to BundleWorkflow interface",
         "0.0.5": "add name tag",

 {
     "schema": "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json",
+    "version": "0.0.8",
     "changelog": {
+        "0.0.8": "enable deterministic training",
         "0.0.7": "update benchmark on A100",
         "0.0.6": "adapt to BundleWorkflow interface",
         "0.0.5": "add name tag",
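
The version bump and the new changelog entry have to stay in sync. A minimal sketch of a consistency check follows, using only the standard library. The embedded JSON is a hypothetical excerpt limited to the entries visible in this hunk, and the helper names `parse_version` and `check_version_matches_changelog` are illustrative; they are not part of MONAI.

```python
import json

# Hypothetical excerpt of configs/metadata.json, limited to the entries visible in the diff.
METADATA_TEXT = """
{
    "version": "0.0.8",
    "changelog": {
        "0.0.8": "enable deterministic training",
        "0.0.7": "update benchmark on A100",
        "0.0.6": "adapt to BundleWorkflow interface",
        "0.0.5": "add name tag"
    }
}
"""


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '0.0.8' into a comparable tuple (0, 0, 8)."""
    return tuple(int(part) for part in version.split("."))


def check_version_matches_changelog(metadata: dict) -> bool:
    """Return True when `version` equals the highest version key in `changelog`."""
    newest = max(metadata["changelog"], key=parse_version)
    return metadata["version"] == newest


metadata = json.loads(METADATA_TEXT)
print(check_version_matches_changelog(metadata))
```

Comparing parsed tuples rather than raw strings keeps the check correct once a component reaches two digits (for example `0.0.10` versus `0.0.9`).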

configs/multi_gpu_evaluate.json CHANGED

@@ -21,7 +21,6 @@
         "$import torch.distributed as dist",
         "$dist.is_initialized() or dist.init_process_group(backend='nccl')",
         "$torch.cuda.set_device(@device)",
-        "$setattr(torch.backends.cudnn, 'benchmark', True)",
         "$import logging",
         "$@validate#evaluator.logger.setLevel(logging.WARNING if dist.get_rank() > 0 else logging.INFO)",
         "$import scripts",

         "$import torch.distributed as dist",
         "$dist.is_initialized() or dist.init_process_group(backend='nccl')",
         "$torch.cuda.set_device(@device)",
         "$import logging",
         "$@validate#evaluator.logger.setLevel(logging.WARNING if dist.get_rank() > 0 else logging.INFO)",
         "$import scripts",

configs/multi_gpu_train.json CHANGED

@@ -31,7 +31,6 @@
         "$dist.is_initialized() or dist.init_process_group(backend='nccl')",
         "$torch.cuda.set_device(@device)",
         "$monai.utils.set_determinism(seed=123)",
-        "$setattr(torch.backends.cudnn, 'benchmark', True)",
         "$import logging",
         "$@train#trainer.logger.setLevel(logging.WARNING if dist.get_rank() > 0 else logging.INFO)",
         "$@validate#evaluator.logger.setLevel(logging.WARNING if dist.get_rank() > 0 else logging.INFO)"

         "$dist.is_initialized() or dist.init_process_group(backend='nccl')",
         "$torch.cuda.set_device(@device)",
         "$monai.utils.set_determinism(seed=123)",
         "$import logging",
         "$@train#trainer.logger.setLevel(logging.WARNING if dist.get_rank() > 0 else logging.INFO)",
         "$@validate#evaluator.logger.setLevel(logging.WARNING if dist.get_rank() > 0 else logging.INFO)"
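
The two `setLevel` expressions kept in this hunk quiet every process except rank 0. The sketch below reproduces that rule with the standard `logging` module only. The rank is passed in explicitly instead of being read from `dist.get_rank()`, so it runs without a process group, and the logger names are hypothetical.

```python
import logging


def level_for_rank(rank: int) -> int:
    """Mirror the config expression: INFO on rank 0, WARNING on every other rank."""
    return logging.WARNING if rank > 0 else logging.INFO


def configure_logger(name: str, rank: int) -> logging.Logger:
    """Apply the rank-dependent level to a named logger (the name is hypothetical)."""
    logger = logging.getLogger(name)
    logger.setLevel(level_for_rank(rank))
    return logger


# Simulate a 2-process run, matching --nproc_per_node=2 in the README commands.
for rank in range(2):
    logger = configure_logger(f"trainer_rank{rank}", rank)
    print(rank, logging.getLevelName(logger.level))
```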

configs/train.json CHANGED

@@ -343,8 +343,7 @@
     "initialize": [
         "$import sys",
         "$sys.path.append(@bundle_root)",
-        "$monai.utils.set_determinism(seed=123)",
-        "$setattr(torch.backends.cudnn, 'benchmark', True)"
     ],
     "run": [
         "$@train#trainer.run()"

     "initialize": [
         "$import sys",
         "$sys.path.append(@bundle_root)",
+        "$monai.utils.set_determinism(seed=123)"
     ],
     "run": [
         "$@train#trainer.run()"
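
To my understanding, `monai.utils.set_determinism` with a seed also turns cuDNN benchmarking off, so the removed `setattr(..., 'benchmark', True)` line that ran after it would have switched the autotuner back on. A rough, standard-library-only sketch of a check for that ordering problem is shown below. The two lists are copied from the two sides of this hunk; the function name and the regular expressions are illustrative assumptions, not MONAI tooling.

```python
import re

# `initialize` lists copied from the two sides of the configs/train.json diff.
OLD_INITIALIZE = [
    "$import sys",
    "$sys.path.append(@bundle_root)",
    "$monai.utils.set_determinism(seed=123)",
    "$setattr(torch.backends.cudnn, 'benchmark', True)",
]
NEW_INITIALIZE = [
    "$import sys",
    "$sys.path.append(@bundle_root)",
    "$monai.utils.set_determinism(seed=123)",
]

# Illustrative patterns: a seeded set_determinism call, and cuDNN benchmark being enabled.
SEEDED = re.compile(r"set_determinism\(\s*seed\s*=\s*\d+")
BENCHMARK_ON = re.compile(
    r"setattr\(\s*torch\.backends\.cudnn\s*,\s*['\"]benchmark['\"]\s*,\s*True\s*\)"
    r"|torch\.backends\.cudnn\.benchmark\s*=\s*True"
)


def benchmark_overrides_determinism(initialize: list[str]) -> bool:
    """Return True if cuDNN benchmark is switched on after a seeded set_determinism call."""
    seeded = False
    for expression in initialize:
        if SEEDED.search(expression):
            seeded = True
        elif seeded and BENCHMARK_ON.search(expression):
            return True
    return False


print(benchmark_overrides_determinism(OLD_INITIALIZE))
print(benchmark_overrides_determinism(NEW_INITIALIZE))
```

The check only inspects the expression strings in order; it does not import `torch` or evaluate the config, so it cannot detect benchmark being enabled elsewhere in the bundle.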

docs/README.md CHANGED

@@ -98,7 +98,7 @@ Example `dataset.json` in output folder:
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_in_out.jpeg)
-## Scores
 This model achieves the following F1 score on the validation data provided as part of the dataset:
 - Train F1 score = 0.941
@@ -125,26 +125,29 @@ Confusion Metrics for <b>Training</b> for individual classes are (at epoch 50):
-## Training Performance
 A graph showing the training Loss and F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_loss_v2.png) <br>
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_f1_v2.png) <br>
-## Validation Performance
 A graph showing the validation F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_f1_v2.png) <br>
-## commands example
-Execute training:
 ```
 python -m monai.bundle run --config_file configs/train.json
 ```
-Override the `train` config to execute multi-GPU training:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']"
@@ -153,19 +156,19 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config
 Please note that the distributed training related options depend on the actual running environment, thus you may need to remove `--standalone`, modify `--nnodes` or do some other necessary changes according to the machine you used.
 Please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) for more details.
-Override the `train` config to execute evaluation with the trained model:
 ```
 python -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json']"
 ```
-Override the `train` config and `evaluate` config to execute multi-GPU evaluation:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json','configs/multi_gpu_evaluate.json']"
 ```
-Execute inference:
 ```
 python -m monai.bundle run --config_file configs/inference.json

 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_in_out.jpeg)
+## Performance
 This model achieves the following F1 score on the validation data provided as part of the dataset:
 - Train F1 score = 0.941
+#### Training Performance
 A graph showing the training Loss and F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_loss_v2.png) <br>
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_train_f1_v2.png) <br>
+#### Validation Performance
 A graph showing the validation F1-score over 50 epochs.
 ![](https://developer.download.nvidia.com/assets/Clara/Images/monai_pathology_classification_val_f1_v2.png) <br>
+## MONAI Bundle Commands
+In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.
+For more detailed usage instructions, visit the [MONAI Bundle Configuration Page](https://docs.monai.io/en/latest/config_syntax.html).
+#### Execute training:
 ```
 python -m monai.bundle run --config_file configs/train.json
 ```
+#### Override the `train` config to execute multi-GPU training:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']"
 Please note that the distributed training related options depend on the actual running environment, thus you may need to remove `--standalone`, modify `--nnodes` or do some other necessary changes according to the machine you used.
 Please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) for more details.
+#### Override the `train` config to execute evaluation with the trained model:
 ```
 python -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json']"
 ```
+#### Override the `train` config and `evaluate` config to execute multi-GPU evaluation:
 ```
 torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/evaluate.json','configs/multi_gpu_evaluate.json']"
 ```
+#### Execute inference:
 ```
 python -m monai.bundle run --config_file configs/inference.json