Update README.md
README.md
CHANGED
@@ -39,44 +39,6 @@ The model is designed to respond to general instructions and can be used to buil
 * Multilingual dialog use cases
 * Long-context tasks including long document/meeting summarization, long document QA, etc.

-**Generation:**
-This is a simple example of how to use the Granite-3.1-8B-Instruct model.
-
-Install the following libraries:
-
-```shell
-pip install torch torchvision torchaudio
-pip install accelerate
-pip install transformers
-```
-Then, copy the snippet from the section that is relevant for your use case.
-
-```python
-import torch
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-device = "cuda"  # use "cpu" if no GPU is available
-model_path = "ibm-granite/granite-3.1-8b-instruct"
-tokenizer = AutoTokenizer.from_pretrained(model_path)
-# drop device_map if running on CPU
-model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
-model.eval()
-# change input text as desired
-chat = [
-    { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
-]
-chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
-# tokenize the text
-input_tokens = tokenizer(chat, return_tensors="pt").to(device)
-# generate output tokens
-output = model.generate(**input_tokens, max_new_tokens=100)
-# decode output tokens into text
-output = tokenizer.batch_decode(output)
-# print output
-print(output)
-```
-
 **Model Architecture:**
 Granite-3.1-8B-Instruct is based on a decoder-only dense transformer architecture. Core components of this architecture are: GQA and RoPE, MLP with SwiGLU, RMSNorm, and shared input/output embeddings.
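The architecture components named in the README (RMSNorm and an MLP with SwiGLU) can be illustrated numerically. The sketch below is an assumption-laden toy, not the model's actual implementation: the dimensions and weights are made up, and real Granite layers use learned parameters at much larger sizes.

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: rescale by the reciprocal root-mean-square (no mean subtraction,
    # unlike LayerNorm), then apply a learned per-channel gain
    return x / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps) * gain

def silu(x):
    # SiLU (swish) activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu_mlp(x, w_gate, w_up, w_down):
    # SwiGLU MLP: a SiLU-activated gate multiplies the up-projection
    # elementwise before the down-projection back to model width
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32  # toy sizes, not Granite's
x = rng.standard_normal((1, d_model))
h = rms_norm(x, gain=np.ones(d_model))
y = swiglu_mlp(h,
               rng.standard_normal((d_model, d_ff)),
               rng.standard_normal((d_model, d_ff)),
               rng.standard_normal((d_ff, d_model)))
print(y.shape)  # → (1, 8): the block maps back to model width
```

In the full model each such block sits inside a residual connection, with RMSNorm applied before attention and before the MLP.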