kobkrit commited on
Commit
157f7e4
·
verified ·
1 Parent(s): 08dfbd4

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +6 -1
README.md CHANGED
@@ -270,9 +270,14 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
270
 
271
  2. Run server
272
  ```bash
273
- vllm serve openthaigpt/openthaigpt1.5-72b-instruct --tensor-parallel-size 4
274
  ```
275
  * Note, change ``--tensor-parallel-size 4`` to the amount of available GPU cards.
 
 
 
 
 
276
 
277
  3. Run inference (CURL example)
278
  ```bash
 
270
 
271
  2. Run server
272
  ```bash
273
+ vllm serve openthaigpt/openthaigpt1.5-72b-instruct --tensor-parallel-size 4
274
  ```
275
  * Note, change ``--tensor-parallel-size 4`` to the amount of available GPU cards.
276
+
277
+ If you wish to enable tool calling feature, add ``--enable-auto-tool-choice --tool-call-parser hermes`` into command. e.g.,
278
+ ```bash
279
+ vllm serve openthaigpt/openthaigpt1.5-72b-instruct --tensor-parallel-size 4 --enable-auto-tool-choice --tool-call-parser hermes
280
+ ```
281
 
282
  3. Run inference (CURL example)
283
  ```bash