--- datasets: - PowerInfer/QWQ-LONGCOT-500K - PowerInfer/LONGCOT-Refine-500K base_model: - Qwen/Qwen2.5-3B-Instruct pipeline_tag: text-generation language: - en library_name: transformers tags: - llama-cpp - imatrix - gguf - IQ4_XS - 3b - SmallThinker - qwen - llama-cpp - PowerInfer - code - math - chat - roleplay - text-generation - safetensors - nlp - code --- # roleplaiapp/SmallThinker-3B-Preview-IQ4_XS-GGUF **Repo:** `roleplaiapp/SmallThinker-3B-Preview-IQ4_XS-GGUF` **Original Model:** `SmallThinker-3B-Preview` **Organization:** `PowerInfer` **Quantized File:** `smallthinker-3b-preview-iq4_xs-imat.gguf` **Quantization:** `GGUF` **Quantization Method:** `IQ4_XS` **Use Imatrix:** `True` **Split Model:** `False` ## Overview This is an imatrix GGUF IQ4_XS quantized version of [SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview). ## Quantization By I often have idle A100 GPUs while building/testing and training the RP app, so I put them to use quantizing models. I hope the community finds these quantizations useful. Andrew Webby @ [RolePlai](https://roleplai.app/)