QwQ-32B-Preview LoRA for separating thinking/answer parts

This LoRA adapter was fine-tuned to make QwQ-32B-Preview consistently separate its private thoughts from its final answer using <THINKING>...</THINKING><ANSWER>...</ANSWER> tags.
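
Because responses follow this fixed tag structure, downstream code can split the two parts with a simple regular expression. Below is a minimal Python sketch; the `split_response` helper and its fallback behaviour are illustrative assumptions, not part of this repository:

```python
import re

def split_response(text: str) -> tuple[str, str]:
    # Hypothetical helper: extract the <THINKING> and <ANSWER> parts
    # of a model response. If the tags are missing (the model is
    # prompted, not guaranteed, to emit them), the whole text is
    # returned as the answer.
    thinking = re.search(r"<THINKING>(.*?)</THINKING>", text, re.DOTALL)
    answer = re.search(r"<ANSWER>(.*?)</ANSWER>", text, re.DOTALL)
    return (
        thinking.group(1).strip() if thinking else "",
        answer.group(1).strip() if answer else text.strip(),
    )
```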

For best results, it's also recommended to add the following to the System Prompt:

Your private thoughts must be placed inside <THINKING>...</THINKING> XML tags, and your final answer to the user must be placed inside <ANSWER>...</ANSWER> XML tags. These tags MUST appear in all your responses.

This GGUF file can be used with Ollama as an adapter on top of the unsloth/QwQ-32B-Preview-GGUF quantized models. See the attached Modelfile for an example.
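
For reference, a minimal Modelfile sketch is shown below, assuming the base GGUF and this adapter have been downloaded locally; the file names, the Q4_K_M quant choice, and the model name used later are assumptions:

```
# Minimal Modelfile sketch -- file names below are assumptions
FROM ./QwQ-32B-Preview-Q4_K_M.gguf
ADAPTER ./QwQ-32B-Preview-with-Tags-LoRA.gguf

# Bake the recommended system prompt into the model
SYSTEM """Your private thoughts must be placed inside <THINKING>...</THINKING> XML tags, and your final answer to the user must be placed inside <ANSWER>...</ANSWER> XML tags. These tags MUST appear in all your responses."""
```

The model can then be built and run with `ollama create qwq-with-tags -f Modelfile` followed by `ollama run qwq-with-tags` (the `qwq-with-tags` name is arbitrary).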
