Model metrics
Model testing was performed on the held-out test set of the dataset. The Dice similarity index (Dice) and the normalized surface distance (NSD) were calculated for each label individually, and 95% confidence intervals were computed using bootstrap resampling with 1000 iterations.
Class ID | Class Description | Dice [95% CI] | NSD [95% CI] |
---|---|---|---|
0 | background | 1.0 [1.0 - 1.0] | 0.999 [0.999 - 1.0] |
1 | T1 | 0.936 [0.915 - 0.951] | 0.976 [0.955 - 0.989] |
2 | T2 | 0.947 [0.925 - 0.962] | 0.986 [0.966 - 0.998] |
3 | T3 | 0.954 [0.94 - 0.965] | 0.993 [0.982 - 0.999] |
4 | T4 | 0.944 [0.92 - 0.963] | 0.981 [0.961 - 0.996] |
5 | T5 | 0.946 [0.925 - 0.964] | 0.984 [0.965 - 0.998] |
6 | T6 | 0.936 [0.906 - 0.96] | 0.972 [0.946 - 0.991] |
7 | T7 | 0.924 [0.886 - 0.952] | 0.962 [0.933 - 0.985] |
8 | T8 | 0.916 [0.876 - 0.949] | 0.951 [0.919 - 0.978] |
9 | T9 | 0.921 [0.886 - 0.95] | 0.953 [0.924 - 0.976] |
10 | T10 | 0.924 [0.89 - 0.951] | 0.955 [0.926 - 0.977] |
11 | T11 | 0.919 [0.884 - 0.95] | 0.947 [0.914 - 0.976] |
12 | T12 | 0.927 [0.892 - 0.955] | 0.952 [0.918 - 0.98] |
13 | L1 | 0.926 [0.893 - 0.954] | 0.95 [0.919 - 0.977] |
14 | L2 | 0.948 [0.921 - 0.968] | 0.969 [0.943 - 0.988] |
15 | L3 | 0.939 [0.908 - 0.963] | 0.958 [0.927 - 0.981] |
16 | L4 | 0.92 [0.884 - 0.947] | 0.942 [0.908 - 0.966] |
17 | L5 | 0.911 [0.876 - 0.941] | 0.937 [0.906 - 0.963] |
18 | L6 | 0.0 [0.0 - 0.0] | 0.0 [0.0 - 0.0] |
19 | Sacrum | 0.955 [0.946 - 0.962] | 0.981 [0.973 - 0.988] |
20 | Os coccygis | NA | NA |
21 | T13 | 0.0 [0.0 - 0.0] | 0.0 [0.0 - 0.0] |
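
As a rough illustration of how per-label Dice scores and bootstrap confidence intervals of this kind can be obtained, here is a minimal sketch in plain Python. It is not the evaluation code used for this model. The function names `dice_per_label` and `bootstrap_ci` are hypothetical, and several details are assumptions not stated above: that the reported value is the mean over test cases, that resampling is done over cases with replacement, and that the interval is a percentile interval. NSD is omitted because it also requires surface extraction and a distance tolerance.

```python
import random
from statistics import mean


def dice_per_label(pred, truth, label):
    """Dice score for one class ID between two flattened label sequences.

    Returns None when the label is absent from both sequences, since the
    score is undefined in that case (assumption: reported as NA).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    pred_n = sum(1 for p in pred if p == label)
    truth_n = sum(1 for t in truth if t == label)
    if pred_n + truth_n == 0:
        return None
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    return 2.0 * overlap / (pred_n + truth_n)


def bootstrap_ci(scores, n_iterations=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-case scores.

    Assumption: cases are resampled with replacement and the mean is the
    summary statistic; the source only states "bootstrap resampling with
    1000 iterations".
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    rng = random.Random(seed)
    n = len(scores)
    # One resampled mean per bootstrap iteration, sorted for percentile lookup.
    means = sorted(mean(rng.choices(scores, k=n)) for _ in range(n_iterations))
    lower = means[int((alpha / 2) * (n_iterations - 1))]
    upper = means[int((1 - alpha / 2) * (n_iterations - 1))]
    return mean(scores), lower, upper


# Toy usage: two flattened label maps, then a CI over made-up per-case scores.
pred = [0, 1, 1, 2, 2, 2]
truth = [0, 1, 2, 2, 2, 2]
dice_label_1 = dice_per_label(pred, truth, 1)  # overlap 1, sizes 2 and 1
dice_label_2 = dice_per_label(pred, truth, 2)  # overlap 3, sizes 3 and 4

example_scores = [0.80, 0.90, 0.95, 0.85, 0.92]  # illustrative values only
estimate, ci_low, ci_high = bootstrap_ci(example_scores)
```

A fixed seed keeps the interval reproducible between runs. A label that never appears in either the prediction or the reference yields `None` here rather than a numeric score.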