Commit c1c16c2 by kaczmarj (verified) · 1 parent: 11b5e05

upload model weights, readme, and config

README.md CHANGED
@@ -1,3 +1,76 @@
- ---
- license: cc-by-4.0
- ---
+ # Pancancer tissue classifier
+
+ This model classifies whole slide images into one of 32 TCGA cancer types. It was trained by Jakub Kaczmarzyk using CLAM.
+
+ Output classes: ACC, BLCA, BRCA, CESC, CHOL, COAD, DLBC, ESCA, GBM, HNSC, KICH, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, MESO, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, TGCT, THCA, THYM, UCEC, UCS, UVM.
+
+ Please see the [TCGA study abbreviations](https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/tcga-study-abbreviations) to map these class names to the TCGA study names.
+
+ ## Data
+
+ Diagnostic slides from TCGA (those marked `DX`) were used to train the model. The whole slide images were tiled into 128x128 µm patches, and each patch was encoded with CTransPath, which produces a 768-dimensional embedding.
+
+ Train, validation, and test splits were stratified by TCGA study, and no patient appears in more than one split.
+
+ Sample sizes:
+ - Train: 9,257 slides (7,633 patients)
+ - Validation: 1,186 slides (955 patients)
+ - Test: 1,163 slides (955 patients)
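The requirement that patients do not cross split boundaries can be checked mechanically. The following is a minimal sketch, not the author's splitting code: it assumes slides are named with standard TCGA barcodes, where the first three dash-separated fields identify the patient. The function names and the slide identifiers in the example are hypothetical.

```python
def patient_id(slide_id: str) -> str:
    """Return the TCGA patient barcode (first three dash-separated fields)."""
    return "-".join(slide_id.split("-")[:3])


def assert_patient_disjoint(splits: dict[str, list[str]]) -> None:
    """Raise ValueError if any patient appears in more than one split."""
    seen: dict[str, str] = {}  # patient barcode -> split where it was first seen
    for split_name, slide_ids in splits.items():
        for slide_id in slide_ids:
            pid = patient_id(slide_id)
            first = seen.setdefault(pid, split_name)
            if first != split_name:
                raise ValueError(
                    f"patient {pid} is in both {first} and {split_name}"
                )


# Hypothetical slide identifiers, for illustration only.
example_splits = {
    "train": ["TCGA-AA-0001-01Z-00-DX1", "TCGA-AA-0001-01Z-00-DX2"],
    "validation": ["TCGA-BB-0002-01Z-00-DX1"],
    "test": ["TCGA-CC-0003-01Z-00-DX1"],
}
assert_patient_disjoint(example_splits)  # passes: no patient is shared
```

Two slides from the same patient may share a split (as in `train` above), which is why slide counts exceed patient counts in the sample sizes.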
+
+ ## Model performance
+
+ The model achieved a weighted average AUROC of 0.99 (one-vs-rest).
+
+ Here are the one-vs-rest AUROC values for each TCGA study.
+
+ - ACC: 0.9993
+ - BLCA: 0.9814
+ - BRCA: 0.9908
+ - CESC: 0.9868
+ - CHOL: 0.9972
+ - COAD: 0.9927
+ - DLBC: 0.9996
+ - ESCA: 0.9571
+ - GBM: 0.9984
+ - HNSC: 0.9974
+ - KICH: 0.9998
+ - KIRC: 0.9993
+ - KIRP: 0.9952
+ - LGG: 0.9984
+ - LIHC: 0.9988
+ - LUAD: 0.9879
+ - LUSC: 0.9868
+ - MESO: 0.9961
+ - OV: 0.9900
+ - PAAD: 0.9897
+ - PCPG: 0.9944
+ - PRAD: 1.0000
+ - READ: 0.9752
+ - SARC: 0.9946
+ - SKCM: 0.9957
+ - STAD: 0.9932
+ - TGCT: 0.9957
+ - THCA: 1.0000
+ - THYM: 0.9991
+ - UCEC: 0.9971
+ - UCS: 0.9863
+ - UVM: 0.9997
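The README reports a weighted average but does not list per-class test counts, so the weighted figure cannot be recomputed from the values above. As a rough sketch, the snippet below computes the unweighted (macro) mean of the listed AUROCs and provides a helper showing how a support-weighted mean would be formed if the counts were available. The names `macro_auroc` and `weighted_average` are illustrative, not part of the released model.

```python
# One-vs-rest AUROC per TCGA study, copied from the list above.
per_class_auroc = {
    "ACC": 0.9993, "BLCA": 0.9814, "BRCA": 0.9908, "CESC": 0.9868,
    "CHOL": 0.9972, "COAD": 0.9927, "DLBC": 0.9996, "ESCA": 0.9571,
    "GBM": 0.9984, "HNSC": 0.9974, "KICH": 0.9998, "KIRC": 0.9993,
    "KIRP": 0.9952, "LGG": 0.9984, "LIHC": 0.9988, "LUAD": 0.9879,
    "LUSC": 0.9868, "MESO": 0.9961, "OV": 0.9900, "PAAD": 0.9897,
    "PCPG": 0.9944, "PRAD": 1.0000, "READ": 0.9752, "SARC": 0.9946,
    "SKCM": 0.9957, "STAD": 0.9932, "TGCT": 0.9957, "THCA": 1.0000,
    "THYM": 0.9991, "UCEC": 0.9971, "UCS": 0.9863, "UVM": 0.9997,
}

# Unweighted mean: every study counts equally, regardless of slide count.
macro_auroc = sum(per_class_auroc.values()) / len(per_class_auroc)


def weighted_average(scores: dict[str, float], support: dict[str, int]) -> float:
    """Average scores weighted by the number of test slides per class."""
    total = sum(support[name] for name in scores)
    return sum(scores[name] * support[name] for name in scores) / total


# The study with the lowest AUROC in the list.
hardest_class = min(per_class_auroc, key=per_class_auroc.get)

print(f"macro AUROC: {macro_auroc:.4f}, lowest: {hardest_class}")
```

With equal support for every class, `weighted_average` reduces to the macro mean; with real test counts it would give more weight to the larger studies.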
+
+ ### Renal cell carcinoma (RCC) subtyping
+
+ RCC subtyping is a relatively common benchmark task for slide-level classification, so we evaluated this model on it.
+
+ When tested on a set of 52 KIRC slides and 28 KIRP slides (from the overall test set), the model achieved a balanced accuracy of 0.88.
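Balanced accuracy is the mean of the per-class recalls, so the smaller class (28 KIRP slides) counts as much as the larger one (52 KIRC slides). The sketch below illustrates the metric in plain Python. The predictions are hypothetical and chosen only to match the class sizes above; they are not the model's actual outputs, and how predictions outside the two subtypes were handled in the reported result is not stated in the README.

```python
from collections import Counter


def balanced_accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [correct[label] / count for label, count in total.items()]
    return sum(recalls) / len(recalls)


# Class sizes from the README; the predictions below are made up.
y_true = ["KIRC"] * 52 + ["KIRP"] * 28
y_pred = (
    ["KIRC"] * 48 + ["KIRP"] * 4    # 48 of 52 KIRC slides predicted correctly
    + ["KIRP"] * 24 + ["KIRC"] * 4  # 24 of 28 KIRP slides predicted correctly
)

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy: {score:.3f}, plain accuracy: {plain_accuracy:.3f}")
```

Because the two classes have different sizes, balanced accuracy and plain accuracy generally differ; the same function applies to the LUAD and LUSC comparison.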
+
+ ### Non-small cell lung cancer (NSCLC) subtyping
+
+ NSCLC subtyping is a relatively common benchmark task for slide-level classification, so we evaluated this model on it.
+
+ When tested on a set of 55 LUAD slides and 58 LUSC slides (from the overall test set), the model achieved a balanced accuracy of 0.76.
+
+
+ # Intended uses
+
+ This model is ONLY intended for research purposes.
+
+ **This model may not be used for clinical purposes.** This model is distributed without warranties, either express or implied.
config.json ADDED
@@ -0,0 +1,41 @@
+ {
+   "spec_version": "1.0",
+   "type": "clam",
+   "patch_size_um": 128,
+   "feature_extractor": "ctranspath",
+   "num_classes": 32,
+   "class_names": [
+     "ACC",
+     "BLCA",
+     "BRCA",
+     "CESC",
+     "CHOL",
+     "COAD",
+     "DLBC",
+     "ESCA",
+     "GBM",
+     "HNSC",
+     "KICH",
+     "KIRC",
+     "KIRP",
+     "LGG",
+     "LIHC",
+     "LUAD",
+     "LUSC",
+     "MESO",
+     "OV",
+     "PAAD",
+     "PCPG",
+     "PRAD",
+     "READ",
+     "SARC",
+     "SKCM",
+     "STAD",
+     "TGCT",
+     "THCA",
+     "THYM",
+     "UCEC",
+     "UCS",
+     "UVM"
+   ]
+ }
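A consumer of `config.json` will typically want to sanity-check it and turn a vector of class scores into a study name. The sketch below does this with the standard library only. It assumes that output index `i` of the model corresponds to `class_names[i]`, which the config suggests but the commit does not state explicitly; `load_config` and `top_class` are hypothetical helper names, and the scores in the example are made up.

```python
import json

# Contents of config.json from this commit, inlined so the example is self-contained.
CONFIG_TEXT = """
{
  "spec_version": "1.0",
  "type": "clam",
  "patch_size_um": 128,
  "feature_extractor": "ctranspath",
  "num_classes": 32,
  "class_names": [
    "ACC", "BLCA", "BRCA", "CESC", "CHOL", "COAD", "DLBC", "ESCA",
    "GBM", "HNSC", "KICH", "KIRC", "KIRP", "LGG", "LIHC", "LUAD",
    "LUSC", "MESO", "OV", "PAAD", "PCPG", "PRAD", "READ", "SARC",
    "SKCM", "STAD", "TGCT", "THCA", "THYM", "UCEC", "UCS", "UVM"
  ]
}
"""


def load_config(text: str) -> dict:
    """Parse the config and check that num_classes matches class_names."""
    config = json.loads(text)
    if config["num_classes"] != len(config["class_names"]):
        raise ValueError("num_classes does not match the length of class_names")
    return config


def top_class(config: dict, scores: list[float]) -> str:
    """Return the class name with the highest score.

    Assumes scores[i] belongs to config["class_names"][i].
    """
    if len(scores) != config["num_classes"]:
        raise ValueError("expected one score per class")
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return config["class_names"][best_index]


config = load_config(CONFIG_TEXT)
example_scores = [0.0] * config["num_classes"]
example_scores[config["class_names"].index("KIRC")] = 1.0  # made-up scores
print(top_class(config, example_scores))
```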
model-with-instance-classifiers.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8c15dcf4dd1d901acd0581850edae97d837fbc920c6f0592baebcb7c7aa2542e
+ size 2830572
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e0539cb88046f2b8515c149b525af8f05e505e5e81e59b5405ebdf07b64e4f4
+ size 2693188
torchscript_model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1c9737b34ba3c1041de80e9b1f42096ebf11438acf0a80d542fdf8d9aed7ed98
+ size 2711792
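The three weight files are stored as Git LFS pointers: each records a spec version, a SHA-256 object ID, and the size in bytes of the real file. The sketch below parses a pointer and checks downloaded bytes against it, which may be useful for confirming that a download is complete. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any Git LFS tooling.

```python
import hashlib

# Pointer for model.safetensors, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8e0539cb88046f2b8515c149b525af8f05e505e5e81e59b5405ebdf07b64e4f4
size 2693188
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that data has the size and SHA-256 digest named by the pointer."""
    if pointer["algorithm"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algorithm']}")
    return (
        len(data) == pointer["size_bytes"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algorithm"], pointer["size_bytes"])
```

For a real check, read the downloaded file in binary mode and pass its bytes to `matches_pointer`; for large files, hashing in chunks would avoid loading everything into memory.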