dewabrata committed
Commit 14a084c · verified · 1 Parent(s): d60436a

Update README.md

Files changed (1):
  1. README.md +25 -5

README.md CHANGED
@@ -3,9 +3,9 @@ license: apache-2.0
 language:
 - id
 base_model:
-- meta-llama/Llama-3.1-70B
+- meta-llama/Llama-3.1-70b
 pipeline_tag: text-generation
-library_name: adapter-transformers
+library_name: vllm
 tags:
 - cerita
 - quantized
@@ -29,9 +29,9 @@ This is a quantized version of the LLaMA 70B model fine-tuned for generating cre
 ---
 
 ## Usage
-You can use this model for text generation tasks with the Hugging Face Transformers library.
+You can use this model for text generation tasks with the Hugging Face Transformers library or with [vLLM](https://github.com/vllm-project/vllm) for efficient inference.
 
-### Example Code
+### Example Code with Transformers
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
@@ -51,6 +51,26 @@ outputs = model.generate(**inputs, max_new_tokens=500, temperature=0.7, top_p=0.
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
+### Example Code with vLLM
+```python
+from vllm import LLM, SamplingParams
+
+# Load model with vLLM
+model_name = "dewabrata/cerita_seru_70B_quantized"
+llm = LLM(model_name)
+
+# Generate text
+prompt = "Ceritakan tentang Widya, seorang wanita berhijab yang bersemangat menjalani hidupnya dan memiliki bakat luar biasa dalam seni lukis."
+sampling_params = SamplingParams(
+    temperature=0.7,
+    top_p=0.9,
+    max_tokens=500
+)
+
+outputs = llm.generate([prompt], sampling_params)
+print(outputs[0].outputs[0].text)
+```
+
 ---
 
 ## Performance
@@ -85,7 +105,7 @@ If you use this model, please cite:
 ```bibtex
 @misc{dewabrata2024,
   author = {Dewabrata},
-  title = {Cerita Panas - Quantized LLaMA 70B},
+  title = {Ceritakan Tentang Widya - Quantized LLaMA 70B},
   year = {2024},
   howpublished = {\url{https://huggingface.co/dewabrata/cerita_seru_70B_quantized}},
 }
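
The diff only surfaces fragments of the Transformers path: the import, the `generate` call (with its `top_p` value truncated in the hunk header), and the decode line; the loading and tokenization steps fall between hunks. A minimal end-to-end sketch consistent with those fragments, assuming `device_map="auto"`, `do_sample=True`, and `top_p=0.9` (all assumptions, not shown in the diff):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "dewabrata/cerita_seru_70B_quantized"

# Load tokenizer and model; device_map="auto" (an assumption) lets
# accelerate shard the 70B weights across available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Same Indonesian prompt as the vLLM example, roughly: "Tell a story about
# Widya, a woman in hijab who lives her life with passion and has an
# extraordinary talent for painting."
prompt = "Ceritakan tentang Widya, seorang wanita berhijab yang bersemangat menjalani hidupnya dan memiliki bakat luar biasa dalam seni lukis."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling settings from the diff: max_new_tokens=500, temperature=0.7;
# top_p=0.9 and do_sample=True are assumptions.
outputs = model.generate(**inputs, max_new_tokens=500, temperature=0.7,
                         top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```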
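
One note on the new vLLM snippet: `LLM(model_name)` with default settings assumes the checkpoint fits on a single GPU, which a 70B model, even quantized, often does not. A sketch of the same call on a multi-GPU machine; the `tensor_parallel_size` value is a hypothetical hardware choice, not something the README specifies:

```python
from vllm import LLM, SamplingParams

# Shard the model across 4 GPUs (hypothetical setup); vLLM picks up the
# quantization method from the checkpoint's own config.
llm = LLM("dewabrata/cerita_seru_70B_quantized", tensor_parallel_size=4)

sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=500)
outputs = llm.generate(["Ceritakan tentang Widya."], sampling_params)

# Each RequestOutput holds its completions in .outputs
print(outputs[0].outputs[0].text)
```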