hawei_LinkedIn commited on
Commit
8da1cd7
·
1 Parent(s): 851c617

upload model weights and model card

Browse files
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
 
 
 
 
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
37
+ *.json filter=lfs diff=lfs merge=lfs -text
38
+ *.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,114 @@
1
  ---
2
  license: llama3.1
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
3
  ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  license: llama3.1
3
+ datasets:
4
+ - nvidia/OpenMathInstruct-2
5
+ language:
6
+ - en
7
+ base_model:
8
+ - meta-llama/Llama-3.1-8B-Instruct
9
+ model-index:
10
+ - name: Control-LLM-Llama3.1-8B-Math16
11
+ results:
12
+ - task:
13
+ type: code-evaluation
14
+ dataset:
15
+ type: mixed
16
+ name: Code Evaluation Dataset
17
+ metrics:
18
+ - name: pass_at_1,n=1 (code_instruct)
19
+ type: pass_at_1
20
+ value: 0.7840083073727934
21
+ stderr: 0.013257237506304915
22
+ verified: false
23
+ - name: pass_at_1,n=1 (humaneval_greedy_instruct)
24
+ type: pass_at_1
25
+ value: 0.8170731707317073
26
+ stderr: 0.03028135999593353
27
+ verified: false
28
+ - name: pass_at_1,n=1 (humaneval_plus_greedy_instruct)
29
+ type: pass_at_1
30
+ value: 0.7439024390243902
31
+ stderr: 0.03418746588364997
32
+ verified: false
33
+ - name: pass_at_1,n=1 (mbpp_plus_0shot_instruct)
34
+ type: pass_at_1
35
+ value: 0.8042328042328042
36
+ stderr: 0.0204357309715418
37
+ verified: false
38
+ - name: pass_at_1,n=1 (mbpp_sanitized_0shot_instruct)
39
+ type: pass_at_1
40
+ value: 0.7587548638132295
41
+ stderr: 0.02673991635681605
42
+ verified: false
43
+ - task:
44
+ type: original-capability
45
+ dataset:
46
+ type: meta/Llama-3.1-8B-Instruct-evals
47
+ name: Llama-3.1-8B-Instruct-evals Dataset
48
+ dataset_path: "meta-llama/llama-3.1-8_b-instruct-evals"
49
+ dataset_name: "Llama-3.1-8B-Instruct-evals__arc_challenge__details"
50
+ metrics:
51
+ - name: exact_match,strict-match (original_capability_instruct)
52
+ type: exact_match
53
+ value: 0.5630801459168563
54
+ stderr: 0.0028483348465514185
55
+ verified: false
56
+ - name: exact_match,strict-match (meta_arc_0shot_instruct)
57
+ type: exact_match
58
+ value: 0.8248927038626609
59
+ stderr: 0.01113972223585952
60
+ verified: false
61
+ - name: exact_match,strict-match (meta_gpqa_0shot_cot_instruct)
62
+ type: exact_match
63
+ value: 0.296875
64
+ stderr: 0.021609729061250887
65
+ verified: false
66
+ - name: exact_match,strict-match (meta_mmlu_0shot_instruct)
67
+ type: exact_match
68
+ value: 0.6815980629539952
69
+ stderr: 0.003931452244804845
70
+ verified: false
71
+ - name: exact_match,strict-match (meta_mmlu_pro_5shot_instruct)
72
+ type: exact_match
73
+ value: 0.4093251329787234
74
+ stderr: 0.004482884901882547
75
+ verified: false
76
  ---
77
+ # Control-LLM-Llama3.1-8B-Math16
78
+ This is a fine-tuned model of Llama-3.1-8B-Instruct for mathematical tasks on OpenCoder SFT dataset.
79
+
80
+ ## Evaluation Results
81
+ Here is an overview of the evaluation results and findings:
82
+
83
+ ### Benchmark Result and Catastrophic Forgetting on OpenCoder
84
+ The following plot illustrates benchmark result and catastrophic forgetting mitigation on the OpenCoder SFT dataset.
85
+
86
+ ![Catastrophic Forgetting](plots/catastrophic_forgetting_opencoder.png)
87
+
88
+ ### Benchmark Results Table
89
+ The table below summarizes evaluation results across coding tasks and original capabilities.
90
+
91
+ | **Model** | **MB+** | **MS** | **HE+** | **HE** | **C-Avg** | **ARC** | **GP** | **MLU** | **MLUP** | **O-Avg** | **Overall** |
92
+ |--------------------|---------|---------|---------|---------|-----------|---------|---------|---------|----------|-----------|-------------|
93
+ | Llama3.1-8B-Ins | 70.4 | 67.7 | 66.5 | 70.7 | 69.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 64.8 |
94
+ | OpenCoder-8B-Ins | 81.2 | 76.3 | 78.0 | 82.3 | 79.5 | 8.2 | 25.4 | 37.4 | 11.3 | 24.6 | 52.1 |
95
+ | **Full Param Tune**| 75.1 | 69.6 | 71.3 | 76.8 | 73.3 | 24.4 | 21.9 | 43.0 | 19.2 | 31.5 | 52.4 |
96
+ | Partial Param Tune | 75.7 | 71.6 | 74.4 | 79.3 | 75.0 | 70.2 | 28.1 | 60.7 | 32.4 | 48.3 | 61.7 |
97
+ | Stack Expansion | 77.2 | 72.8 | 73.2 | 78.7 | 75.6 | 80.0 | 26.3 | 66.6 | 38.2 | 54.2 | 64.9 |
98
+ | Hybrid Expansion* | 77.5 | 73.5 | **76.2**| **82.3**| 77.1 | 80.9 | **32.6**| 68.1 | 40.3 | 56.0 | 66.6 |
99
+ | **Control LLM*** | **80.4**| **75.9**| 74.4 | 81.1 | **78.3** | **82.5**| 29.7 | **68.2**| **40.9** | **56.3** | **67.3** |
100
+
101
+ ---
102
+
103
+ ### Explanation:
104
+ - **MB+**: MBPP Plus
105
+ - **MS**: MBPP Sanitized
106
+ - **HE+**: HumanEval Plus
107
+ - **HE**: HumanEval
108
+ - **C-Avg**: Coding - Size Weighted Average across MB+, MS, HE+, and HE
109
+ - **ARC**: ARC benchmark
110
+ - **GP**: GPQA benchmark
111
+ - **MLU**: MMLU (Massive Multitask Language Understanding)
112
+ - **MLUP**: MMLU Pro
113
+ - **O-Avg**: Original Capability - Size Weighted Average across ARC, GPQA, MMLU, and MMLU Pro
114
+ - **Overall**: Combined average across all tasks
benchmark_results_code_instruct.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a6199e12ee08cddf074ddb4712b39440e8be166bd99d45eeec1e7050b3252aea
3
+ size 18321
benchmark_results_original_capability_instruct.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b236479e2183f37eae7caef1adce93f20147ac2be478c8ffc893a42583b6c699
3
+ size 17231
config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c3ac4fc4f51ac8ae6a23d7425533c9ce7a91c5b5a7eda80c2909f0b56336fc01
3
+ size 1747
evaluation_results.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4f4a88dc174e1604af1e83f0d6d13c71862a65e9bf646511270a3d9f141b2ee4
3
+ size 921
generation_config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6e3dea9cdfa4a3e0edf1b52d9b1e9609aa22efcddce630060b748284663c8e67
3
+ size 177
model-00001-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:51f918e91dec283d291d864aaa45d69ffec7a757c4eb47e0b028d0034e6b9268
3
+ size 4976765742
model-00002-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:44f2bf1cf8eddf0c3b76952a4ec902ceffffe1fdad84766fee02acd585252dde
3
+ size 4999853582
model-00003-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:928b8ee9f7f26f1ab43be8e36f1bbcdb0b5b2c127bf94585180c2dd30cd11311
3
+ size 4915983640
model-00004-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4218fea139ca82ff487a993df7323a28c367c27f9920013753e420036e694b12
3
+ size 4999886888
model-00005-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6a1674c2c65e088709514f377d59d691c1454bcdfcfb875f820fadb2e9512009
3
+ size 3147970380
model.safetensors.index.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:99a0746418bc2b35797d04c9fd6460fcf4e4de768b7aa9ab4b5f427b52099448
3
+ size 43015
plots/catastrophic_forgetting_opencoder.png ADDED

Git LFS Details

  • SHA256: c5fea7e309d3531a58cd0cbb03964c543f445cc858ef75b2ad1132eccf39bd1a
  • Pointer size: 131 Bytes
  • Size of remote file: 479 kB
special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:453db79c09538b7953c4d9846d4bc0b46b86a296f285cdecc29f739f0b98f6a9
3
+ size 572
tokenizer.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a200d62d1a1177908f4310d7e367f0194d474db0038dc1f2f2434a3ef74af7d9
3
+ size 17210284
tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9e407a45cd60fbcdad88f1a22adab70157c47e858c0c94995de05e87b06205aa
3
+ size 55820