---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{}
---

# Model Card for OpenThaiGPT 13b

<!-- Provide a quick summary of what the model is/does. -->

OpenThaiGPT 13b is a Thai-language question-answering chat model.

The prompt format follows the Llama 2 chat template:
```
<s>[INST] <<SYS>>
system_prompt
<</SYS>>

question [/INST]
```

System prompt (the Thai sentence restates the English instruction):

You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
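
For example, combining the template with this system prompt and the question from the cURL request below (อยากลดความอ้วนต้องทำอย่างไร, "How do I lose weight?") gives the full prompt string:

```
<s>[INST] <<SYS>>
You are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด
<</SYS>>

อยากลดความอ้วนต้องทำอย่างไร [/INST]
```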

## How to use

1. Install vLLM (https://github.com/vllm-project/vllm).
2. Start the API server (see the setup sketch after this list): `python -m vllm.entrypoints.api_server --model /path/to/model --tensor-parallel-size num_gpus`
3. Run inference (cURL example below).
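
A minimal setup sketch for steps 1 and 2, assuming a pip installation of vLLM; the model path and GPU count are placeholders to adjust for your machine:

```
# 1. Install vLLM (requires a CUDA-capable GPU environment)
pip install vllm

# 2. Launch the vLLM API server; replace the model path and GPU count with your own values
python -m vllm.entrypoints.api_server \
  --model /path/to/model \
  --tensor-parallel-size 2
```

Once the server is running, send a request as in the cURL example below (step 3).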

```
curl --request POST \
  --url http://localhost:8000/generate \
  --header "Content-Type: application/json" \
  --data '{"prompt": "<s>[INST] <<SYS>>\nYou are a question answering assistant. Answer the question as truthful and helpful as possible คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด\n<</SYS>>\n\nอยากลดความอ้วนต้องทำอย่างไร [/INST]", "use_beam_search": false, "temperature": 0.1, "max_tokens": 512, "top_p": 0.75, "top_k": 40, "frequency_penalty": 0.3, "stop": "</s>"}'
```