Commit 6ee299f by pszemraj (verified) · 1 parent: 89176c1

Update README.md

  1. README.md +10 -2
README.md CHANGED
@@ -749,12 +749,20 @@ Thus far, all completed in fp32 (_using nvidia tf32 dtype behind the scenes when
749
  | SST2 | 90.6% | - | - | - | - | 0.2464 |
750
  | QNLI | 89.6% | - | - | - | - | 0.2891 |
751
  | MRPC | 84.07% | 86.59 | - | - | - | 0.3759 |
752
- | STSB | - | 92.07 | 92.23 | 91.92 | - | 0.4103 |
753
  | MNLI | 82.2% | - | - | - | - | 0.4602 |
754
- | CoLA | - | - | - | - | 60.72 | 0.4569 |
755
  | RTE | 66.43% | - | - | - | - | 0.6981 |
756
  | WNLI | 35.21% | - | - | - | - | 0.7425 |
757
 
758
  ### Observations:
759
 
760
  - **Performance Variation**: There's notable variation in model performance across different GLUE tasks. This variation can be attributed to the distinct nature of each task, the complexity of the datasets, and how well the model's architecture and hyperparameters are suited to each task.
 
749
  | SST2 | 90.6% | - | - | - | - | 0.2464 |
750
  | QNLI | 89.6% | - | - | - | - | 0.2891 |
751
  | MRPC | 84.07% | 86.59 | - | - | - | 0.3759 |
752
+ | STSB | - | 92.07 | 0.9223 | 0.9192 | - | 0.4103 |
753
  | MNLI | 82.2% | - | - | - | - | 0.4602 |
754
+ | CoLA | - | - | - | - | 0.6072 | 0.4569 |
755
  | RTE | 66.43% | - | - | - | - | 0.6981 |
756
  | WNLI | 35.21% | - | - | - | - | 0.7425 |
757
 
758
+ 8-layer BERT with standard 512 ctx:
759
+
760
+
761
+ | Model | CoLA | SST-2 | MRPC | STS-B | QNLI | WNLI | RTE |
762
+ |-----------------------------|------|-------|--------|-------|------|------|--------|
763
+ | bert_uncased_L-8_H-768_A-12 | 0.54 | 0.91 | 0.88 | 0.93 | 0.90 | 0.34 | 0.67 |
764
+
765
+
766
  ### Observations:
767
 
768
  - **Performance Variation**: There's notable variation in model performance across different GLUE tasks. This variation can be attributed to the distinct nature of each task, the complexity of the datasets, and how well the model's architecture and hyperparameters are suited to each task.
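
The rows above mix three notations: percentages (`90.6%`), 0-100 values (`86.59`), and 0-1 values (`0.9223`, which this commit introduces for STSB and CoLA). A minimal sketch of putting score cells on one 0-1 scale follows. The helper name `to_unit_scale` is hypothetical and not part of the repository, and the rule "anything with magnitude above 1 is on a 0-100 scale" is an assumption that only holds for score columns, not for the loss column.

```python
from typing import Optional


def to_unit_scale(cell: str) -> Optional[float]:
    """Convert a score cell such as '90.6%', '86.59', '0.6072', or '-' to a 0-1 value.

    Hypothetical helper for illustration; intended for score columns only.
    """
    text = cell.strip()
    if text in {"", "-"}:
        return None  # metric not reported for this task
    if text.endswith("%"):
        return float(text[:-1]) / 100.0
    number = float(text)
    # Assumption: magnitudes above 1 are on a 0-100 scale.
    # abs() keeps negative correlations (e.g. Matthews) on the right scale.
    return number / 100.0 if abs(number) > 1.0 else number


if __name__ == "__main__":
    for raw in ["90.6%", "92.07", "0.9223", "-"]:
        print(raw, "->", to_unit_scale(raw))
```

Applied to the STSB row, this would map the remaining `92.07` entry to the same scale as `0.9223` and `0.9192`.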