fblgit committed on
Commit
e1cdc5b
·
verified ·
1 Parent(s): c422045

Update README.md

  1. README.md +26 -2
README.md CHANGED
@@ -2,9 +2,33 @@
2
  license: apache-2.0
3
  datasets:
4
  - fblgit/simple-math
 
5
  tags:
6
  - UNA
 
 
7
  ---
8
 
9
- So far an experiment, not sure how it went.
10
- Is based on Smaug and used SimpleMath dataset.
 
2
  license: apache-2.0
3
  datasets:
4
  - fblgit/simple-math
5
+ base_model: abacusai/Smaug-34B-v0.1
6
  tags:
7
  - UNA
8
+ - simple-math
9
+ - juanako
10
  ---
11
 
12
+ # UNA-SimpleSmaug-34b-v1beta
13
+
14
+ So far this is an experiment, and it is not yet clear how it went. UNA was applied only to the attention layers, not to the MLPs.
15
+ * Based on Smaug (abacusai/Smaug-34B-v0.1)
16
+ * Fine-tuned on the SimpleMath dataset (fblgit/simple-math)
17
+ * Trained with Axolotl
18
+
19
+ ## Experiment
20
+ The goal is to understand what impact SimpleMath has when applied at the attention layers during an SFT session, and how it affects the neural network overall.
21
+ ## Evals
22
+
23
+ Full evals are pending; so far only this result is available:
24
+ ```
25
+ | Task |Version| Metric |Value | |Stderr|
26
+ |-------------|------:|--------|-----:|---|-----:|
27
+ |arc_challenge| 0|acc |0.7201|± |0.0131|
28
+ | | |acc_norm|0.7457|± |0.0127|
29
+ ```
30
+
31
+ It seems to improve the GSM and ARC scores.
32
+
33
+ ## Citations
34
+ Thanks to abacusai for making Smaug-34B, the Bagel, and all the magic behind the base model.
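
As a sanity check on the arc_challenge table above, the Stderr column is consistent with the usual standard error of a mean over binary per-question scores. The sketch below assumes the ARC-Challenge test split of 1,172 questions and a sample (n - 1) variance; both are assumptions, since the README does not state how the figures were produced, and `accuracy_stderr` is a hypothetical helper name.

```python
import math

# Assumption: size of the ARC-Challenge test split (not stated in the README).
N_QUESTIONS = 1172


def accuracy_stderr(acc: float, n: int = N_QUESTIONS) -> float:
    """Standard error of the mean of n binary (0/1) scores whose mean is acc.

    Uses the sample variance n * acc * (1 - acc) / (n - 1), divided by n,
    which simplifies to acc * (1 - acc) / (n - 1).
    """
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


# Values taken from the table in the README.
for name, acc in [("acc", 0.7201), ("acc_norm", 0.7457)]:
    print(f"{name}: {acc:.4f} ± {accuracy_stderr(acc):.4f}")
```

Under these assumptions the computed values round to 0.0131 and 0.0127, matching the reported column. With roughly a 1.3 point standard error, small differences in ARC accuracy between two checkpoints should be read with care.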