kurcontko commited on
Commit
45c53f2
·
verified ·
1 Parent(s): 71a262f

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +12 -0
README.md ADDED
@@ -0,0 +1,12 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # QwQ-32B-Preview-bnb-4bit
2
+
3
+ ## Introduction
4
+
5
+ QwQ-32B-Preview-bnb-4bit is a 4-bit quantized version of the [QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview) model, utilizing the Bits and Bytes (bnb) quantization technique. This quantization significantly reduces the model's size and inference latency, making it more accessible for deployment on resource-constrained hardware.
6
+
7
+ ## Model Details
8
+
9
+ - **Quantization:** 4-bit using Bits and Bytes (bnb)
10
+ - **Base Model:** [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview)
11
+ - **Parameters:** 32.5 billion
12
+ - **Context Length:** Up to 32,768 tokens