ibibrahim committed on
Commit d91cbed · verified · 1 Parent(s): cf1a1df

Update README.md

Files changed (1)
  1. README.md +48 -42
README.md CHANGED
@@ -12,8 +12,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: human-exams
-      name: MMLU
+      type: human-exams
+      name: MMLU
     metrics:
     - name: pass@1
       type: pass@1
@@ -22,8 +22,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: human-exams
-      name: MMLU-Pro
+      type: human-exams
+      name: MMLU-Pro
     metrics:
     - name: pass@1
       type: pass@1
@@ -32,8 +32,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: human-exams
-      name: AGI-Eval
+      type: human-exams
+      name: AGI-Eval
     metrics:
     - name: pass@1
       type: pass@1
@@ -42,8 +42,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: commonsense
-      name: WinoGrande
+      type: commonsense
+      name: WinoGrande
     metrics:
     - name: pass@1
       type: pass@1
@@ -52,18 +52,18 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: commonsense
-      name: OBQA
+      type: commonsense
+      name: OBQA
     metrics:
     - name: pass@1
       type: pass@1
-      value: 39.00
+      value: 39
       veriefied: false
   - task:
       type: text-generation
     dataset:
-      type: commonsense
-      name: SIQA
+      type: commonsense
+      name: SIQA
     metrics:
     - name: pass@1
       type: pass@1
@@ -72,8 +72,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: commonsense
-      name: PIQA
+      type: commonsense
+      name: PIQA
     metrics:
     - name: pass@1
       type: pass@1
@@ -82,8 +82,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: commonsense
-      name: Hellaswag
+      type: commonsense
+      name: Hellaswag
     metrics:
     - name: pass@1
       type: pass@1
@@ -92,8 +92,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: commonsense
-      name: TruthfulQA
+      type: commonsense
+      name: TruthfulQA
     metrics:
     - name: pass@1
       type: pass@1
@@ -102,8 +102,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: reading-comprehension
-      name: BoolQ
+      type: reading-comprehension
+      name: BoolQ
     metrics:
     - name: pass@1
       type: pass@1
@@ -112,8 +112,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: reading-comprehension
-      name: SQuAD 2.0
+      type: reading-comprehension
+      name: SQuAD 2.0
     metrics:
     - name: pass@1
       type: pass@1
@@ -122,8 +122,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: reasoning
-      name: ARC-C
+      type: reasoning
+      name: ARC-C
     metrics:
     - name: pass@1
       type: pass@1
@@ -132,8 +132,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: reasoning
-      name: GPQA
+      type: reasoning
+      name: GPQA
     metrics:
     - name: pass@1
       type: pass@1
@@ -142,8 +142,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: reasoning
-      name: BBH
+      type: reasoning
+      name: BBH
     metrics:
     - name: pass@1
       type: pass@1
@@ -152,8 +152,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: reasoning
-      name: MUSR
+      type: reasoning
+      name: MUSR
     metrics:
     - name: pass@1
       type: pass@1
@@ -162,8 +162,8 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: code
-      name: HumanEval
+      type: code
+      name: HumanEval
     metrics:
     - name: pass@1
       type: pass@1
@@ -172,33 +172,34 @@ model-index:
   - task:
       type: text-generation
     dataset:
-      type: code
-      name: MBPP
+      type: code
+      name: MBPP
     metrics:
     - name: pass@1
      type: pass@1
-      value: 23.20
-      veriefied: false
+      value: 23.2
+      veriefied: false
   - task:
       type: text-generation
     dataset:
-      type: math
-      name: GSM8K
+      type: math
+      name: GSM8K
     metrics:
     - name: pass@1
       type: pass@1
       value: 19.26
-      veriefied: false
+      veriefied: false
   - task:
       type: text-generation
     dataset:
-      type: math
-      name: MATH
+      type: math
+      name: MATH
     metrics:
     - name: pass@1
       type: pass@1
       value: 8.96
       veriefied: false
+new_version: ibm-granite/granite-3.1-1b-a400m-base
 ---
 
 <!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png) -->
@@ -289,6 +290,11 @@ We train Granite 3.0 Language Models using IBM's super computing cluster, Blue V
 **Ethical Considerations and Limitations:**
 The use of Large Language Models involves risks and ethical considerations people must be aware of, including but not limited to: bias and fairness, misinformation, and autonomous decision-making. The Granite-3.0-1B-A400M-Base model is no exception in this regard. Even though this model is suited for multiple generative AI tasks, it has not undergone any safety alignment, so it may produce problematic outputs. Additionally, it remains uncertain whether smaller models are more susceptible to hallucination in generation scenarios, for example by copying text verbatim from the training dataset, owing to their reduced size and memorization capacity. This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigation in this domain. Regarding ethics, a latent risk associated with all Large Language Models is their malicious use. We urge the community to use the Granite-3.0-1B-A400M-Base model with ethical intentions and in a responsible way.
 
+**Resources**
+- ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
+- 📄 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
+- 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources
+
 <!-- ## Citation
 ```
 @misc{granite-models,
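For reference, every metric entry touched in the hunks above follows the Hugging Face model-index layout used in model-card front matter. The sketch below shows one such entry in full; the model name is an assumption, the OBQA figures are copied from the diff above, and the standard key is `verified` (the card's front matter spells it `veriefied`).

```yaml
# Minimal sketch of a single model-index result entry (illustrative; the model
# name is an assumption, the OBQA figures are copied from the diff above).
model-index:
- name: granite-3.0-1b-a400m-base
  results:
  - task:
      type: text-generation
    dataset:
      type: commonsense
      name: OBQA
    metrics:
    - name: pass@1
      type: pass@1
      value: 39
      verified: false   # the card itself spells this key "veriefied"
```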
 