improve readme

README.md (CHANGED)
---
license: other
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
base_model: anakin87/gemma-2b-orpo
tags:
- orpo
datasets:
- alvarobartt/dpo-mix-7k-simplified
language:
- en
---

<img src="https://huggingface.co/anakin87/gemma-2b-orpo/resolve/main/assets/gemma-2b-orpo.png" width="450"/>

# gemma-2b-orpo-GGUF

This is a GGUF quantized version of the [`gemma-2b-orpo` model](https://huggingface.co/anakin87/gemma-2b-orpo/): an ORPO fine-tune of google/gemma-2b.

You can find more information, including the evaluation and a training/usage notebook, in the [`gemma-2b-orpo` model card](https://huggingface.co/anakin87/gemma-2b-orpo/).

## 🎮 Model in action
The model runs with all the libraries that are part of the Llama.cpp ecosystem.

If you need to apply the prompt template manually, take a look at the [tokenizer_config.json of the original model](https://huggingface.co/anakin87/gemma-2b-orpo/blob/main/tokenizer_config.json); a small formatting helper is sketched below.
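
For reference, here is a minimal sketch of such a helper, assuming the ChatML-style template with a leading `<bos>` that also appears in the generation example further down; the `format_prompt` function is illustrative only, so confirm the exact template against the tokenizer config:

```python
# Minimal sketch of manual prompt formatting (assumption: ChatML-style
# template with a leading <bos>, as in the text-generation example below).
def format_prompt(messages: list[dict]) -> str:
    """Render chat messages into the prompt string the model expects."""
    prompt = "<bos>"
    for message in messages:
        prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    # leave the assistant turn open so the model completes it
    prompt += "<|im_start|>assistant\n"
    return prompt
```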

Here is a simple example with **llama-cpp-python** (in a notebook, install it first with `! pip install llama-cpp-python`):
```python
from llama_cpp import Llama

# download the GGUF file from the Hub and load it
llm = Llama.from_pretrained(
    repo_id="anakin87/gemma-2b-orpo-GGUF",
    filename="gemma-2b-orpo.Q5_K_M.gguf",
    verbose=True,  # due to a known bug, verbose must be True
)

# text generation - prompt template applied manually
llm(
    "<bos><|im_start|>user\nName the planets in the solar system<|im_end|>\n<|im_start|>assistant\n",
    max_tokens=75,
)

# chat completion - prompt template applied automatically
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Please list some places to visit in Italy"}
    ]
)
```
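
If you want tokens as they are produced, `create_chat_completion` also accepts `stream=True` and yields OpenAI-style chunks; a minimal sketch:

```python
# stream the chat completion token by token
stream = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Please list some places to visit in Italy"}],
    stream=True,
)
for chunk in stream:
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)
```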