dewabrata committed
Commit 14a084c · verified · 1 Parent(s): d60436a

Update README.md

Files changed (1):
  1. README.md +25 -5

README.md CHANGED
@@ -3,9 +3,9 @@ license: apache-2.0
 language:
 - id
 base_model:
-- meta-llama/Llama-3.1-70B
+- meta-llama/Llama-3.1-70b
 pipeline_tag: text-generation
-library_name: adapter-transformers
+library_name: vllm
 tags:
 - cerita
 - quantized
@@ -29,9 +29,9 @@ This is a quantized version of the LLaMA 70B model fine-tuned for generating cre
 ---
 
 ## Usage
-You can use this model for text generation tasks with the Hugging Face Transformers library.
+You can use this model for text generation tasks with the Hugging Face Transformers library or with [vLLM](https://github.com/vllm-project/vllm) for efficient inference.
 
-### Example Code
+### Example Code with Transformers
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
@@ -51,6 +51,26 @@ outputs = model.generate(**inputs, max_new_tokens=500, temperature=0.7, top_p=0.
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
+### Example Code with vLLM
+```python
+from vllm import LLM, SamplingParams
+
+# Load model with vLLM
+model_name = "dewabrata/cerita_seru_70B_quantized"
+llm = LLM(model_name)
+
+# Generate text
+prompt = "Ceritakan tentang Widya, seorang wanita berhijab yang bersemangat menjalani hidupnya dan memiliki bakat luar biasa dalam seni lukis."
+sampling_params = SamplingParams(
+    temperature=0.7,
+    top_p=0.9,
+    max_tokens=500
+)
+
+outputs = llm.generate([prompt], sampling_params)
+print(outputs[0].outputs[0].text)
+```
+
 ---
 
 ## Performance
@@ -85,7 +105,7 @@ If you use this model, please cite:
 ```bibtex
 @misc{dewabrata2024,
   author = {Dewabrata},
-  title = {Cerita Panas - Quantized LLaMA 70B},
+  title = {Ceritakan Tentang Widya - Quantized LLaMA 70B},
   year = {2024},
   howpublished = {\url{https://huggingface.co/dewabrata/cerita_seru_70B_quantized}},
 }
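
The diff only surfaces fragments of the Transformers path: the import, the `generate` call (with its `top_p` value truncated in the hunk header), and the decode line; the loading and tokenization steps fall between hunks. A minimal end-to-end sketch consistent with those fragments, assuming `device_map="auto"`, `do_sample=True`, and `top_p=0.9` (all assumptions, not shown in the diff):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "dewabrata/cerita_seru_70B_quantized"

# Load tokenizer and model; device_map="auto" (an assumption) lets
# accelerate shard the 70B weights across available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Same Indonesian prompt as the vLLM example, roughly: "Tell a story about
# Widya, a woman in hijab who lives her life with passion and has an
# extraordinary talent for painting."
prompt = "Ceritakan tentang Widya, seorang wanita berhijab yang bersemangat menjalani hidupnya dan memiliki bakat luar biasa dalam seni lukis."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling settings from the diff: max_new_tokens=500, temperature=0.7;
# top_p=0.9 and do_sample=True are assumptions.
outputs = model.generate(**inputs, max_new_tokens=500, temperature=0.7,
                         top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```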
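
One note on the new vLLM snippet: `LLM(model_name)` with default settings assumes the checkpoint fits on a single GPU, which a 70B model, even quantized, often does not. A sketch of the same call on a multi-GPU machine; the `tensor_parallel_size` value is a hypothetical hardware choice, not something the README specifies:

```python
from vllm import LLM, SamplingParams

# Shard the model across 4 GPUs (hypothetical setup); vLLM picks up the
# quantization method from the checkpoint's own config.
llm = LLM("dewabrata/cerita_seru_70B_quantized", tensor_parallel_size=4)

sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=500)
outputs = llm.generate(["Ceritakan tentang Widya."], sampling_params)

# Each RequestOutput holds its completions in .outputs
print(outputs[0].outputs[0].text)
```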