pipeline_tag: text-generation
---

# OLMo 7B-Instruct-GGUF

> For more details on OLMo-7B-Instruct, refer to [Allen AI's OLMo-7B-Instruct model card](https://huggingface.co/allenai/OLMo-7B-Instruct).

OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
The OLMo base models are trained on the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset.
The Instruct version is trained on the [cleaned version of the UltraFeedback dataset](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned).

OLMo 7B Instruct is trained for better question answering. It demonstrates the performance gains that OLMo base models can achieve with existing fine-tuning techniques.

This version of the model is derived from [ssec-uw/OLMo-7B-Instruct-hf](https://huggingface.co/ssec-uw/OLMo-7B-Instruct-hf) and converted to [GGUF format](https://huggingface.co/docs/hub/en/gguf), a binary format optimized for quick loading and saving of models, making it highly efficient for inference.
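
If you are curious what a GGUF file actually stores, you can inspect its header without loading the model. Below is a minimal sketch, assuming the `gguf` Python package that ships with `llama.cpp` (`pip install gguf`); the file path is a placeholder for wherever you saved the model.

```python
# Minimal sketch: inspect a GGUF file's header without loading the model.
# Assumes `pip install gguf`; adjust the path to your downloaded file.
from gguf import GGUFReader

reader = GGUFReader("path/to/OLMo-7B-Instruct-Q4_K_M.gguf")

# Metadata keys stored in the header (architecture, tokenizer, etc.).
for key in reader.fields:
    print(key)

print(f"{len(reader.tensors)} tensors")
```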

In addition to being converted to GGUF format, the model has been [quantized](https://huggingface.co/docs/optimum/en/concept_guides/quantization) to reduce the computational and memory costs of running inference. *We are currently working on adding all of the [Quantization Types](https://huggingface.co/docs/hub/en/gguf#quantization-types).*
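
To check which quantization variants have been published so far, you can list this repository's files. A minimal sketch using `huggingface_hub` (`pip install huggingface_hub`); the set of files will grow as more quantization types are added.

```python
# List the GGUF files currently published in this repository.
from huggingface_hub import list_repo_files

for name in list_repo_files("ssec-uw/OLMo-7B-Instruct-GGUF"):
    if name.endswith(".gguf"):
        print(name)  # e.g. OLMo-7B-Instruct-Q4_K_M.gguf
```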

These files are designed for use with [GGML](https://ggml.ai/) and executors based on GGML such as [llama.cpp](https://github.com/ggerganov/llama.cpp).

## Get Started

To get started with one of the GGUF files, you can use [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python binding for `llama.cpp`.

1. Install `llama-cpp-python` with pip:

```bash
pip install llama-cpp-python
```

2. Download one of the GGUF files. In this example, we will download [OLMo-7B-Instruct-Q4_K_M.gguf](https://huggingface.co/ssec-uw/OLMo-7B-Instruct-GGUF/resolve/main/OLMo-7B-Instruct-Q4_K_M.gguf?download=true), which downloads when you click the link. If you prefer to fetch it programmatically, see the sketch below.
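
A minimal sketch of the programmatic route, using `huggingface_hub` (the repo id and filename match the link above):

```python
# Download the quantized model into the local Hugging Face cache.
# Assumes `pip install huggingface_hub`.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="ssec-uw/OLMo-7B-Instruct-GGUF",
    filename="OLMo-7B-Instruct-Q4_K_M.gguf",
)
print(model_path)  # pass this as `model_path` in the next step
```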

3. Open a Python interpreter and run the following commands. For example, we can ask it: `What is a solar system?`. *You will need to change the `model_path` argument to the location where the GGUF model is saved on your system.*

```python
from llama_cpp import Llama

# Load the quantized model; point model_path at your local GGUF file.
llm = Llama(
    model_path="path/to/OLMo-7B-Instruct-Q4_K_M.gguf"
)

# Run a completion; echo=True includes the prompt in the returned text.
result_dict = llm(prompt="What is a solar system?", echo=True, max_tokens=500)
print(result_dict['choices'][0]['text'])
```

4. That's it! You should see the result fairly quickly. Have fun! 🤖
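
Beyond plain text completion, `llama-cpp-python` also provides an OpenAI-style chat API. A minimal sketch; note that `create_chat_completion` formats the messages with a chat template (read from the GGUF metadata when available), so the exact prompt formatting applied to this file is an assumption worth verifying.

```python
# Chat-style inference; messages follow the OpenAI chat format.
from llama_cpp import Llama

llm = Llama(model_path="path/to/OLMo-7B-Instruct-Q4_K_M.gguf")

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is a solar system?"}],
    max_tokens=500,
)
print(response["choices"][0]["message"]["content"])
```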

## Contact

For errors in this model card, contact Don or Anant: {landungs, anmittal} at uw dot edu.