bamec66557 committed on
Commit 68d2b27 · verified · 1 Parent(s): da223cb

Adding Evaluation Results (#1)

- Adding Evaluation Results (d22da6d6d8f348100c31f3169056f3de262a1eee)

Files changed (1)
  1. README.md +119 -11
README.md CHANGED
@@ -1,4 +1,14 @@
  ---
+ language:
+ - en
+ - ko
+ license: apache-2.0
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ - text-generation-inference
+ - not-for-all-audiences
  base_model:
  - bamec66557/MNRP_0.5
  - nbeerbower/mistral-nemo-wissenschaft-12B
@@ -7,16 +17,101 @@ base_model:
  - redrix/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS
  - crestf411/MN-Slush
  - DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
- library_name: transformers
- tags:
- - mergekit
- - merge
- - text-generation-inference
- - not-for-all-audiences
- license: apache-2.0
- language:
- - en
- - ko
+ model-index:
+ - name: MISCHIEVOUS-12B
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 38.52
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bamec66557/MISCHIEVOUS-12B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 34.07
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bamec66557/MISCHIEVOUS-12B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 12.46
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bamec66557/MISCHIEVOUS-12B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 9.4
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bamec66557/MISCHIEVOUS-12B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 11.28
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bamec66557/MISCHIEVOUS-12B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 29.69
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bamec66557/MISCHIEVOUS-12B
+       name: Open LLM Leaderboard
  ---
 
  <a href="#" target="_blank"><img src="https://huggingface.co/bamec66557/MISCHIEVOUS-12B/resolve/main/00001-321918068.gif"></a>
@@ -67,4 +162,17 @@ parameters:
  epsilon: 0.05
  lambda: 1
  dtype: bfloat16
- ```
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/bamec66557__MISCHIEVOUS-12B-details)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |22.57|
+ |IFEval (0-Shot)    |38.52|
+ |BBH (3-Shot)       |34.07|
+ |MATH Lvl 5 (4-Shot)|12.46|
+ |GPQA (0-shot)      | 9.40|
+ |MuSR (0-shot)      |11.28|
+ |MMLU-PRO (5-shot)  |29.69|
+
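The commit does not say how the `Avg.` row is derived. A minimal sketch, assuming it is the unweighted arithmetic mean of the six benchmark scores, reproduces the reported figure. The helper name `leaderboard_average` is hypothetical and is not part of any leaderboard tooling.

```python
# Scores copied from the results table added in this commit.
scores = {
    "IFEval (0-Shot)": 38.52,
    "BBH (3-Shot)": 34.07,
    "MATH Lvl 5 (4-Shot)": 12.46,
    "GPQA (0-shot)": 9.40,
    "MuSR (0-shot)": 11.28,
    "MMLU-PRO (5-shot)": 29.69,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed aggregation rule)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 22.57"
```

The six scores sum to 135.42, and 135.42 / 6 = 22.57, which matches the `Avg.` row, so the table is at least consistent with a plain mean.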