How to convert GGUF
#11, opened by Jasper17
Can I ask you how to achieve the conversion? Why do I get the error shown below?
(llm_venv_llamacpp) xlab@xlab:/mnt/Model/MistralAI/llm_llamacpp$ python convert_hf_to_gguf.py /mnt/Model/MistralAI/Mistral-Large-Instruct-2407 --outfile ../llm_quantized/mistral_large2_instruct_f16.gguf --outtype f16 --no-lazy
INFO:hf-to-gguf:Loading model: Mistral-Large-Instruct-2407
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 131072
INFO:hf-to-gguf:gguf: embedding length = 12288
INFO:hf-to-gguf:gguf: feed forward length = 28672
INFO:hf-to-gguf:gguf: head count = 96
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 1000000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Setting special token type bos to 1
INFO:gguf.vocab:Setting special token type eos to 2
INFO:gguf.vocab:Setting special token type unk to 0
INFO:gguf.vocab:Setting add_bos_token to True
INFO:gguf.vocab:Setting add_eos_token to False
INFO:gguf.vocab:Setting chat_template to {%- if messages[0]['role'] == 'system' %}
{%- set system_message = messages[0]['content'] %}
{%- set loop_messages = messages[1:] %}
{%- else %}
{%- set loop_messages = messages %}
{%- endif %}
{{- bos_token }}
{%- for message in loop_messages %}
{%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
{{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}
{%- endif %}
{%- if message['role'] == 'user' %}
{%- if loop.last and system_message is defined %}
{{- '[INST] ' + system_message + '\n\n' + message['content'] + '[/INST]' }}
{%- else %}
{{- '[INST] ' + message['content'] + '[/INST]' }}
{%- endif %}
{%- elif message['role'] == 'assistant' %}
{{- ' ' + message['content'] + eos_token}}
{%- else %}
{{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}
{%- endif %}
{%- endfor %}
INFO:hf-to-gguf:Set model quantization version
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:../llm_quantized/mistral_large2_instruct_f16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:hf-to-gguf:Model successfully exported to ../llm_quantized/mistral_large2_instruct_f16.gguf
Hi,
Where is the error? The last line, "Model successfully exported to ../llm_quantized/mistral_large2_instruct_f16.gguf", means the conversion finished with no error.
The error shows in the two lines below: no tensor data was written, and the generated GGUF file is only a few hundred kilobytes. May I ask whether you changed any llama.cpp parameters during quantisation?
INFO:gguf.gguf_writer:../llm_quantized/mistral_large2_instruct_f16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
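For anyone who wants to confirm this on their own output file, here is a minimal sketch that reads the fixed-size GGUF header and reports the tensor count. It assumes a little-endian GGUF v2 or v3 file (magic, uint32 version, uint64 tensor count, uint64 metadata key/value count); the function name read_gguf_header is just an illustration and is not part of llama.cpp.

```python
import struct
import sys

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF header.

    Assumes the little-endian GGUF v2/v3 layout; older v1 files used
    32-bit counts and would not parse correctly here.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4-byte magic + uint32 version + two uint64 counts
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:])
    return version, tensor_count, kv_count


if __name__ == "__main__" and len(sys.argv) > 1:
    version, n_tensors, n_kv = read_gguf_header(sys.argv[1])
    print(f"GGUF v{version}: {n_tensors} tensors, {n_kv} metadata entries")
    if n_tensors == 0:
        print("No tensors were written: the file contains metadata only.")
```

Running it against the exported file should report 0 tensors if the conversion wrote metadata only, as the log above suggests.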
I would pull the latest llama.cpp changes from git and make sure you run make clean and then make, so that you have all the latest changes.