Update README.md
README.md
CHANGED
@@ -3,9 +3,9 @@ license: apache-2.0
 language:
 - id
 base_model:
-- meta-llama/Llama-3.1-
+- meta-llama/Llama-3.1-70b
 pipeline_tag: text-generation
-library_name:
+library_name: vllm
 tags:
 - cerita
 - quantized
@@ -29,9 +29,9 @@ This is a quantized version of the LLaMA 70B model fine-tuned for generating cre
 ---
 
 ## Usage
-You can use this model for text generation tasks with the Hugging Face Transformers library.
+You can use this model for text generation tasks with the Hugging Face Transformers library or with [vLLM](https://github.com/vllm-project/vllm) for efficient inference.
 
-### Example Code
+### Example Code with Transformers
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
@@ -51,6 +51,26 @@ outputs = model.generate(**inputs, max_new_tokens=500, temperature=0.7, top_p=0.
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
+### Example Code with vLLM
+```python
+from vllm import LLM, SamplingParams
+
+# Load model with vLLM
+model_name = "dewabrata/cerita_seru_70B_quantized"
+llm = LLM(model_name)
+
+# Generate text
+prompt = "Ceritakan tentang Widya, seorang wanita berhijab yang bersemangat menjalani hidupnya dan memiliki bakat luar biasa dalam seni lukis."  # "Tell a story about Widya, a hijab-wearing woman who lives her life with enthusiasm and has an extraordinary talent for painting."
+sampling_params = SamplingParams(
+    temperature=0.7,
+    top_p=0.9,
+    max_tokens=500
+)
+
+outputs = llm.generate([prompt], sampling_params)
+print(outputs[0].outputs[0].text)
+```
+
 ---
 
 ## Performance
@@ -85,7 +105,7 @@ If you use this model, please cite:
 ```bibtex
 @misc{dewabrata2024,
 author = {Dewabrata},
-title = {
+title = {Ceritakan Tentang Widya - Quantized LLaMA 70B},
 year = {2024},
 howpublished = {\url{https://huggingface.co/dewabrata/cerita_seru_70B_quantized}},
 }
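The diff collapses most of the Transformers snippet (only the final `generate` and `decode` calls appear as hunk context), so here is a minimal sketch of how that path is typically wired up. The `device_map`, `torch_dtype`, `top_p`, and prompt below are illustrative assumptions, not the card's collapsed lines:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "dewabrata/cerita_seru_70B_quantized"

# Load the tokenizer and the quantized 70B model; device_map="auto"
# shards the weights across available GPUs (assumed setting).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Sample with the settings visible in the hunk context
# (temperature=0.7; top_p=0.9 is assumed from the vLLM example).
prompt = "Ceritakan tentang Widya."  # "Tell a story about Widya."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=500, temperature=0.7, top_p=0.9, do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```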
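On the vLLM side, `llm.generate` returns one `RequestOutput` per prompt, and each request's candidate completions sit in a nested `.outputs` list, which is why the generated text is read as `outputs[0].outputs[0].text`. A minimal sketch of batched generation under that structure, with illustrative prompts that are not from the card:

```python
from vllm import LLM, SamplingParams

llm = LLM("dewabrata/cerita_seru_70B_quantized")
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=500)

# Illustrative Indonesian prompts (not from the model card).
prompts = [
    "Ceritakan tentang seorang pelukis muda di Yogyakarta.",  # "...a young painter in Yogyakarta."
    "Ceritakan tentang petualangan di hutan Kalimantan.",  # "...an adventure in the Kalimantan forest."
]

# One RequestOutput per prompt; the top completion for each
# request is the first CompletionOutput under .outputs.
for request_output in llm.generate(prompts, params):
    print(request_output.prompt)
    print(request_output.outputs[0].text)
```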