FileNotFoundError - the tokenizer.model file could not be found
Thank you for providing the GGUF version of the model. I followed your earlier suggestion and ran make-ggml.py, but it failed with a FileNotFoundError: the tokenizer.model file could not be found. That file does not appear to be present in the original model's directory. Can you confirm whether you used the same make-ggml.py script you recommended earlier? If so, could you say where you obtained the tokenizer.model file?
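For anyone hitting the same FileNotFoundError, a quick way to see which tokenizer files a model directory actually contains before running the conversion script is sketched below. This is only a rough check: the helper name list_tokenizer_files is hypothetical, and every candidate file name other than tokenizer.model is an assumption based on common Hugging Face naming, not something confirmed for this model.

```shell
#!/bin/sh
# Hypothetical helper: print the tokenizer-related files found in a model directory.
# Only tokenizer.model comes from the error above; the other names are assumed
# (common Hugging Face tokenizer file names) and may not apply to this model.
list_tokenizer_files() {
    dir="$1"
    for name in tokenizer.model tokenizer.json vocab.json merges.txt; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Usage: list_tokenizer_files /path/to/model-dir
```

If tokenizer.model is missing from the output but other tokenizer files are listed, the model probably does not ship a SentencePiece file at all, which would explain the error.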
I tried to run inference on your model with llama.cpp, following the instructions in the README, but hit a tokenizer-related error. I am not sure whether I missed a step or whether llama.cpp is incompatible with this model type. The error output is:
...
llama_model_loader: - type f32: 121 tensors
llama_model_loader: - type q4_K: 361 tensors
llama_model_loader: - type q6_K: 61 tensors
ERROR: byte not found in vocab: '
'
Segmentation fault (core dumped)
I used the following commands:
huggingface-cli download TheBloke/AquilaChat2-34B-GGUF aquilachat2-34b.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
./main -ngl 32 -m aquilachat2-34b.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "System: A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\nHuman: {prompt}\nAssistant:"
Make sure you're using the latest version of llama.cpp - earlier versions had the bug you describe.