Tool for GGUF conversion
Hello @bartowski! Thank you for sharing the models.
Could you share which tool you used for the conversion to GGUF, if it's not a secret? llama.cpp's qwen2_vl_surgery produces a model with the clip architecture, and the KV pairs are different.
Thank you!
I used qwen2_vl_surgery, so that's interesting. Can you confirm the command you ran? I'll check whether maybe it's been updated since I made this.
I tried different commits... let's say
https://github.com/ggerganov/llama.cpp/blob/b4327/examples/llava/qwen2_vl_surgery.py
Command to convert:

```shell
export PYTHONPATH=$PYTHONPATH:$(pwd)/gguf-py
python3 examples/llava/qwen2_vl_surgery.py ../qwen2vl_model/
```
When I load your model in llama-server, I see in the logs:
```
llama_model_loader: loaded meta data with 33 key-value pairs and 339 tensors from Qwen2-VL-7B-Instruct-f16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2vl
llama_model_loader: - kv 1: general.type str = model
```
and it's fine, I can use it. But when I try to load the model that I converted, I get:
```
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = clip
llama_model_loader: - kv 1: general.description str = image encoder for Qwen2VL
```
and loading fails. What's interesting is that my output is consistent with the code of qwen2_vl_surgery.py.
Which branch or commit did you use? Maybe I missed a step?
Did you then also do the other conversion, or are you attempting to load ONLY the clip file?
You need both the converted model and the mmproj conversion. It's two separate commands: the surgery, and then the normal conversion procedure.
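To make the two-step procedure concrete, here is a sketch of the full workflow under the assumption that you have a llama.cpp checkout around that commit; the output filenames and the `--outtype` flag are illustrative, so check the actual names the scripts print on your version:

```shell
# Step 1: the surgery script extracts the vision encoder into a
# separate GGUF (the "clip" file you saw). This file alone cannot
# be loaded as a model; it is the mmproj companion.
export PYTHONPATH=$PYTHONPATH:$(pwd)/gguf-py
python3 examples/llava/qwen2_vl_surgery.py ../qwen2vl_model/

# Step 2: the normal HF-to-GGUF conversion produces the main
# language model (general.architecture = qwen2vl).
python3 convert_hf_to_gguf.py ../qwen2vl_model/ --outtype f16

# At inference time, load both files together, e.g. (filenames
# are examples, not the scripts' guaranteed output names):
# llama-qwen2vl-cli -m Qwen2-VL-7B-Instruct-f16.gguf \
#     --mmproj qwen2vl-vision.gguf \
#     --image img.png -p "Describe this image"
```

The error in the logs above comes from pointing the loader at the step-1 file only: `general.architecture = clip` marks the vision projector, which must be passed via `--mmproj` alongside the step-2 model file.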
Thank you very much @bartowski! It was helpful.