RichardErkhov committed on
Commit f756e19 · verified · 1 Parent(s): 68edfe6

uploaded readme

Files changed (1): README.md (+196, -0)
README.md ADDED
 
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


granite-3b-code-instruct-128k - AWQ
- Model creator: https://huggingface.co/ibm-granite/
- Original model: https://huggingface.co/ibm-granite/granite-3b-code-instruct-128k/

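As a minimal sketch of how the AWQ weights in this repository can be loaded (assuming the `autoawq` and `accelerate` packages are installed and a CUDA GPU is available; the repository ID below is a placeholder, since it is not restated in this README), standard `transformers` loading picks up the AWQ quantization config from the checkpoint:

```python
# Minimal sketch: load an AWQ-quantized checkpoint with transformers.
# "<quantized-repo-id>" is a placeholder for the ID of this repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

quant_repo_id = "<quantized-repo-id>"
tokenizer = AutoTokenizer.from_pretrained(quant_repo_id)
model = AutoModelForCausalLM.from_pretrained(quant_repo_id, device_map="auto")
```
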
Original model description:
---
pipeline_tag: text-generation
inference: false
license: apache-2.0
datasets:
- bigcode/commitpackft
- TIGER-Lab/MathInstruct
- meta-math/MetaMathQA
- glaiveai/glaive-code-assistant-v3
- glaive-function-calling-v2
- bugdaryan/sql-create-context-instruction
- garage-bAInd/Open-Platypus
- nvidia/HelpSteer
- bigcode/self-oss-instruct-sc2-exec-filter-50k
metrics:
- code_eval
library_name: transformers
tags:
- code
- granite
model-index:
- name: granite-3b-code-instruct-128k
  results:
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis (Python)
    metrics:
    - name: pass@1
      type: pass@1
      value: 53.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalSynthesis (Average)
    metrics:
    - name: pass@1
      type: pass@1
      value: 41.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalExplain (Average)
    metrics:
    - name: pass@1
      type: pass@1
      value: 25.1
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix (Average)
    metrics:
    - name: pass@1
      type: pass@1
      value: 26.2
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (Python@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 48.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (C++@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 36.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (Java@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 38.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (TypeScript@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 39.0
      verified: false
  - task:
      type: text-generation
    dataset:
      type: repoqa
      name: RepoQA (Rust@16K)
    metrics:
    - name: pass@1 (thresh=0.5)
      type: pass@1 (thresh=0.5)
      value: 29.0
      verified: false
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png)

# Granite-3B-Code-Instruct-128K

## Model Summary
**Granite-3B-Code-Instruct-128K** is a 3B-parameter long-context instruct model fine-tuned from *Granite-3B-Code-Base-128K* on a combination of **permissively licensed** data used in training the original Granite code instruct models, in addition to synthetically generated code instruction datasets tailored for solving long-context problems. By exposing the model to both short and long context data, we aim to enhance its long-context capability without sacrificing code generation performance at short input context.

- **Developers:** IBM Research
- **GitHub Repository:** [ibm-granite/granite-code-models](https://github.com/ibm-granite/granite-code-models)
- **Paper:** [Scaling Granite Code Models to 128K Context](https://arxiv.org/abs/2405.04324)
- **Release Date:** July 18th, 2024
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

## Usage
### Intended use
The model is designed to respond to coding-related instructions over long-context input up to 128K in length, and can be used to build coding assistants.

<!-- TO DO: Check starcoder2 instruct code example that includes the template https://huggingface.co/bigcode/starcoder2-15b-instruct-v0.1 -->

### Generation
This is a simple example of how to use the **Granite-3B-Code-Instruct-128K** model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # or "cpu"
model_path = "ibm-granite/granite-3b-code-instruct-128k"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
    { "role": "user", "content": "Write a code to find the maximum value in a list of numbers." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")
# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# loop over the batch to print; in this example the batch size is 1
for i in output:
    print(i)
```

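With a reasonably recent `transformers` release, the template application and tokenization above can also be collapsed into a single call. This is only a minimal variant of the same example (reusing the model, tokenizer, and `device` defined above), not an additional official snippet:

```python
# Minimal variant: apply the chat template and tokenize in one call,
# starting from the original `chat` list of messages (before it is
# overwritten by the rendered template string).
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(device)
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.batch_decode(output)[0])
```
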
<!-- TO DO: Check this part -->
## Training Data
Granite Code Instruct models are trained on a mix of short- and long-context data as follows.
* Short-Context Instruction Data: [CommitPackFT](https://huggingface.co/datasets/bigcode/commitpackft), [BigCode-SC2-Instruct](https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k), [MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), [Glaive-Code-Assistant-v3](https://huggingface.co/datasets/glaiveai/glaive-code-assistant-v3), [Glaive-Function-Calling-v2](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2), [NL2SQL11](https://huggingface.co/datasets/bugdaryan/sql-create-context-instruction), [HelpSteer](https://huggingface.co/datasets/nvidia/HelpSteer), and [OpenPlatypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus), along with a synthetically generated dataset for API calling and multi-turn code interactions with execution feedback. We also include a collection of hardcoded prompts to ensure the model generates correct outputs given inquiries about its name or developers.
* Long-Context Instruction Data: a synthetically generated dataset built by bootstrapping repository-level, file-packed documents through Granite-8b-Code-Instruct to improve the long-context capability of the model.

## Infrastructure
We train the Granite Code models using two of IBM's supercomputing clusters, Vela and Blue Vela, outfitted with NVIDIA A100 and H100 GPUs, respectively. These clusters provide a scalable and efficient infrastructure for training our models over thousands of GPUs.

## Ethical Considerations and Limitations
Granite code instruct models are primarily fine-tuned using instruction-response pairs across a specific set of programming languages, so their performance may be limited on out-of-domain programming languages. In that situation, it is beneficial to provide few-shot examples to steer the model's output, as sketched below. Moreover, developers should perform safety testing and target-specific tuning before deploying these models in critical applications. The model also inherits ethical considerations and limitations from its base model. For more information, please refer to the *[Granite-3B-Code-Base-128K](https://huggingface.co/ibm-granite/granite-3b-code-base-128k)* model card.
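As one concrete illustration of the few-shot suggestion above (a hedged sketch; the target language and example turns below are hypothetical and not taken from the original model card), in-context examples can be supplied as prior chat turns before the actual request:

```python
# Hypothetical few-shot prompt: earlier user/assistant turns serve as
# in-context examples before the final request, reusing the tokenizer
# from the Generation section above.
few_shot_chat = [
    { "role": "user", "content": "Write a COBOL paragraph that adds two numbers." },
    { "role": "assistant", "content": "ADD-NUMBERS.\n    ADD NUM-A TO NUM-B GIVING NUM-SUM." },
    { "role": "user", "content": "Write a COBOL paragraph that multiplies two numbers." },
]
prompt = tokenizer.apply_chat_template(few_shot_chat, tokenize=False, add_generation_prompt=True)
# `prompt` can then be tokenized and passed to model.generate() exactly as in
# the Generation example.
```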