nintwentydo/pixtral-12b-FP8-dynamic-FP8-KV-cache

Image-Text-to-Text

compressed-tensors

nintwentydo committed 9 days ago

Commit 1ff7a22 • 1 Parent(s): 9f2abb4

Update README.md

Files changed (1)

README.md +3 -1

@@ -35,4 +35,6 @@ Example VLLM usage
 vllm serve nintwentydo/pixtral-12b-FP8-dynamic-FP8-KV-cache --quantization fp8 --kv-cache-dtype fp8
 ```
-Supported on Nvidia GPUs with compute capability > 8.9 (Ada Lovelace, Hopper).
+Supported on Nvidia GPUs with compute capability >= 8.9 (Ada Lovelace, Hopper).
+**Edit:** Something seems to be wrong with the tokenizer. If you run into issues, add `--tokenizer mistral-community/pixtral-12b` to your vLLM command-line arguments.
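
For reference, the workaround in the added line amounts to appending the `--tokenizer` flag to the existing serve command. The sketch below is one possible way to assemble it, not the author's script: the helper name `pixtral_serve_cmd` and the `RUN` switch are invented for illustration, while the model repo, tokenizer repo, and flags come from the README text above. By default it only prints the command.

```shell
# Model and tokenizer repos taken from the README text above.
MODEL="nintwentydo/pixtral-12b-FP8-dynamic-FP8-KV-cache"
TOKENIZER="mistral-community/pixtral-12b"

# Hypothetical helper: prints the serve command with the tokenizer override.
pixtral_serve_cmd() {
  printf 'vllm serve %s --quantization fp8 --kv-cache-dtype fp8 --tokenizer %s\n' \
    "$MODEL" "$TOKENIZER"
}

# Dry run by default; RUN=1 (a switch invented for this sketch) launches the server.
if [ "${RUN:-0}" = "1" ]; then
  sh -c "$(pixtral_serve_cmd)"
else
  pixtral_serve_cmd
fi
```

Printing first makes it easy to confirm the flags before starting a server that will load the model onto the GPU.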