Files changed (1)
  1. README.md +47 -55
README.md CHANGED
@@ -12,8 +12,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: human-exams
- name: MMLU
+ type: human-exams
+ name: MMLU
  metrics:
  - name: pass@1
  type: pass@1
@@ -22,8 +22,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: human-exams
- name: MMLU-Pro
+ type: human-exams
+ name: MMLU-Pro
  metrics:
  - name: pass@1
  type: pass@1
@@ -32,8 +32,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: human-exams
- name: AGI-Eval
+ type: human-exams
+ name: AGI-Eval
  metrics:
  - name: pass@1
  type: pass@1
@@ -42,38 +42,38 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: commonsense
- name: WinoGrande
+ type: commonsense
+ name: WinoGrande
  metrics:
  - name: pass@1
  type: pass@1
- value: 80.9
+ value: 80.90
  veriefied: false
  - task:
  type: text-generation
  dataset:
- type: commonsense
- name: OBQA
+ type: commonsense
+ name: OBQA
  metrics:
  - name: pass@1
  type: pass@1
- value: 46.8
+ value: 46.80
  veriefied: false
  - task:
  type: text-generation
  dataset:
- type: commonsense
- name: SIQA
+ type: commonsense
+ name: SIQA
  metrics:
  - name: pass@1
  type: pass@1
- value: 67.8
+ value: 67.80
  veriefied: false
  - task:
  type: text-generation
  dataset:
- type: commonsense
- name: PIQA
+ type: commonsense
+ name: PIQA
  metrics:
  - name: pass@1
  type: pass@1
@@ -82,8 +82,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: commonsense
- name: Hellaswag
+ type: commonsense
+ name: Hellaswag
  metrics:
  - name: pass@1
  type: pass@1
@@ -92,8 +92,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: commonsense
- name: TruthfulQA
+ type: commonsense
+ name: TruthfulQA
  metrics:
  - name: pass@1
  type: pass@1
@@ -102,8 +102,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: reading-comprehension
- name: BoolQ
+ type: reading-comprehension
+ name: BoolQ
  metrics:
  - name: pass@1
  type: pass@1
@@ -112,8 +112,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: reading-comprehension
- name: SQuAD 2.0
+ type: reading-comprehension
+ name: SQuAD 2.0
  metrics:
  - name: pass@1
  type: pass@1
@@ -122,18 +122,18 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: reasoning
- name: ARC-C
+ type: reasoning
+ name: ARC-C
  metrics:
  - name: pass@1
  type: pass@1
- value: 63.4
+ value: 63.40
  veriefied: false
  - task:
  type: text-generation
  dataset:
- type: reasoning
- name: GPQA
+ type: reasoning
+ name: GPQA
  metrics:
  - name: pass@1
  type: pass@1
@@ -142,8 +142,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: reasoning
- name: BBH
+ type: reasoning
+ name: BBH
  metrics:
  - name: pass@1
  type: pass@1
@@ -152,8 +152,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: reasoning
- name: MUSR
+ type: reasoning
+ name: MUSR
  metrics:
  - name: pass@1
  type: pass@1
@@ -162,8 +162,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: code
- name: HumanEval
+ type: code
+ name: HumanEval
  metrics:
  - name: pass@1
  type: pass@1
@@ -172,42 +172,39 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: code
- name: MBPP
+ type: code
+ name: MBPP
  metrics:
  - name: pass@1
  type: pass@1
- value: 41.4
- veriefied: false
+ value: 41.40
+ veriefied: false
  - task:
  type: text-generation
  dataset:
- type: math
- name: GSM8K
+ type: math
+ name: GSM8K
  metrics:
  - name: pass@1
  type: pass@1
  value: 64.06
- veriefied: false
+ veriefied: false
  - task:
  type: text-generation
  dataset:
- type: math
- name: MATH
+ type: math
+ name: MATH
  metrics:
  - name: pass@1
  type: pass@1
  value: 29.28
- veriefied: false
- new_version: ibm-granite/granite-3.1-8b-base
+ veriefied: false
  ---
  <!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png) -->
  <!-- ![image/png](granite-3_0-language-models_Group_1.png) -->

  # Granite-3.0-8B-Base

- <!-- **Note: We are continuously improving our models and recommend users to checkout our latest [Granite 3.1](https://huggingface.co/collections/ibm-granite/granite-31-language-models-6751dbbf2f3389bec5c6f02d) models.** -->
-
  **Model Summary:**
  Granite-3.0-8B-Base is a decoder-only language model to support a variety of text-to-text generation tasks. It is trained from scratch following a two-stage training strategy. In the first stage, it is trained on 10 trillion tokens sourced from diverse domains. During the second stage, it is further trained on 2 trillion tokens using a carefully curated mix of high-quality data, aiming to enhance its performance on specific tasks.

@@ -291,11 +288,6 @@ We train Granite 3.0 Language Models using IBM's super computing cluster, Blue V
  **Ethical Considerations and Limitations:**
  The use of Large Language Models involves risks and ethical considerations people must be aware of, including but not limited to: bias and fairness, misinformation, and autonomous decision-making. Granite-3.0-8B-Base model is not the exception in this regard. Even though this model is suited for multiple generative AI tasks, it has not undergone any safety alignment, there it may produce problematic outputs. Additionally, it remains uncertain whether smaller models might exhibit increased susceptibility to hallucination in generation scenarios by copying text verbatim from the training dataset due to their reduced sizes and memorization capacities. This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain. Regarding ethics, a latent risk associated with all Large Language Models is their malicious utilization. We urge the community to use Granite-3.0-8B-Base model with ethical intentions and in a responsible way.

- **Resources**
- - ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
- - 📄 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
- - 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources
-
  <!-- ## Citation
  ```
  @misc{granite-models,
@@ -306,4 +298,4 @@ The use of Large Language Models involves risks and ethical considerations peopl
  year = {2024},
  url = {https://arxiv.org/abs/0000.00000},
  }
- ``` -->
+ ``` -->
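
One detail worth flagging in the front-matter diff above: every metric entry keeps the misspelled key `veriefied`, whereas the Hugging Face model-index schema expects `verified`, so the flag is effectively ignored by the Hub. Below is a minimal sketch, assuming a local `README.md` and PyYAML, that parses the card's front matter and lists metric keys outside an expected set; `EXPECTED_KEYS` here is my own illustration, not an official schema dump.

```python
# Minimal sketch: parse the YAML front matter of a local model-card README.md and
# flag unexpected keys in model-index metric entries (e.g. the misspelled "veriefied").
# EXPECTED_KEYS is illustrative only, not an authoritative schema definition.
import yaml

EXPECTED_KEYS = {"name", "type", "value", "verified", "args"}  # assumption for illustration

def load_front_matter(path: str) -> dict:
    """Return the YAML block between the leading '---' markers of a model card."""
    text = open(path, encoding="utf-8").read()
    _, front_matter, _ = text.split("---", 2)  # naive split; adequate for this card's layout
    return yaml.safe_load(front_matter)

def check_metric_keys(card: dict) -> None:
    for model in card.get("model-index", []):
        for result in model.get("results", []):
            dataset = result.get("dataset", {}).get("name", "<unknown dataset>")
            for metric in result.get("metrics", []):
                unexpected = set(metric) - EXPECTED_KEYS
                if unexpected:
                    print(f"{dataset}: unexpected metric keys {sorted(unexpected)}")

if __name__ == "__main__":
    check_metric_keys(load_front_matter("README.md"))
```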
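All of the scores in the metadata are reported as pass@1. For the generation-style benchmarks (HumanEval, MBPP) this is commonly computed with the unbiased pass@k estimator of Chen et al. (2021); the sketch below shows that standard formula for reference, not necessarily the exact harness used to produce the numbers in this card.

```python
# Standard unbiased pass@k estimator (Chen et al., 2021): given n samples per problem
# with c of them correct, estimate the probability that at least one of k randomly
# drawn samples is correct. Illustrative only; the evaluation harness behind this
# card's numbers may differ.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """1 - C(n - c, k) / C(n, k), with the n - c < k edge case handled."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples per problem, 82 correct -> pass@1 = 0.41 (i.e. 41%).
print(round(pass_at_k(n=200, c=82, k=1), 2))
```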
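Since the card body (only partially visible in this diff) describes Granite-3.0-8B-Base as a decoder-only model for general text-to-text generation, a minimal usage sketch with the `transformers` library follows; the device placement, dtype, and generation settings are illustrative defaults, not values prescribed by the card.

```python
# Minimal generation sketch for ibm-granite/granite-3.0-8b-base using transformers.
# Device, dtype, and max_new_tokens are illustrative choices, not card-mandated settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ibm-granite/granite-3.0-8b-base"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # halves memory vs. float32 on supported hardware
).to(device)
model.eval()

# Base model (no chat template): plain prompt completion.
prompt = "The Moon is Earth's only natural satellite and"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```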