kurcontko/QwQ-32B-Preview-bnb-4bit

Text Generation

4-bit precision


kurcontko committed on Nov 30, 2024

Commit 45c53f2 · verified · 1 Parent(s): 71a262f

Create README.md

Files changed (1)

README.md +12 -0

README.md ADDED

	@@ -0,0 +1,12 @@

+# QwQ-32B-Preview-bnb-4bit
+## Introduction
+QwQ-32B-Preview-bnb-4bit is a 4-bit quantized version of the [QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview) model, produced with the bitsandbytes (bnb) library. Storing the weights in 4 bits cuts weight memory to roughly a quarter of a 16-bit checkpoint, making the model easier to deploy on memory-constrained hardware.
+## Model Details
+- **Quantization:** 4-bit using bitsandbytes (bnb)
+- **Base Model:** [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview)
+- **Parameters:** 32.5 billion
+- **Context Length:** Up to 32,768 tokens
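
To put the parameter count and 4-bit precision listed above in perspective, the following is a rough back-of-the-envelope sketch of weight memory, not a measurement of this checkpoint. It assumes every one of the 32.5 billion parameters is stored at the stated bit width. In practice bitsandbytes keeps some layers at higher precision and stores per-block quantization constants, and the KV cache and activations need additional memory, so real usage will be higher. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB.

    Assumption: all parameters use the same bit width; quantization
    constants, unquantized layers, KV cache, and activations are ignored.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


NUM_PARAMS = 32.5e9  # parameter count taken from the model details above

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # 16-bit baseline
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f"4-bit weights:  ~{int4_gib:.1f} GiB")
print(f"reduction:      {fp16_gib / int4_gib:.0f}x")
```

Under these assumptions the 4-bit weights take a quarter of the 16-bit footprint, which is the difference between needing multiple high-memory GPUs and fitting the weights on a single card with around 24 GB, before accounting for the overheads noted above.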