---
license: apache-2.0
language:
- en
base_model:
- Qwen/QwQ-32B-Preview
pipeline_tag: text-generation
tags:
- chat
- qwen2
---
|
# QwQ-32B-Preview-bnb-4bit |
|
|
|
## Introduction |
|
|
|
QwQ-32B-Preview-bnb-4bit is a 4-bit quantized version of the [QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview) model, produced with the [bitsandbytes](https://github.com/bitsandbytes-foundation/bitsandbytes) (bnb) quantization library. Quantizing the weights to 4 bits cuts the model's memory footprint to roughly a quarter of the 16-bit checkpoint, making it deployable on resource-constrained hardware such as a single consumer GPU.
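As a rough sanity check of that claim, the weights-only footprint can be estimated from the parameter count (this sketch ignores quantization constants, the KV cache, and activation memory, which add overhead in practice):

```python
# Back-of-envelope weights-only memory estimate for a 32.5B-parameter model.
PARAMS = 32.5e9

fp16_gib = PARAMS * 2 / 2**30    # 2 bytes per weight in fp16/bf16
int4_gib = PARAMS * 0.5 / 2**30  # 0.5 bytes per weight at 4-bit

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")  # ~60.5 GiB
print(f"4-bit weights:  ~{int4_gib:.1f} GiB")  # ~15.1 GiB
```

So the quantized weights fit comfortably in 24 GB of VRAM, whereas the 16-bit checkpoint does not fit on any single consumer GPU.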
|
|
|
## Model Details |
|
|
|
- **Quantization:** 4-bit via bitsandbytes (bnb)
|
- **Base Model:** [Qwen/QwQ-32B-Preview](https://huggingface.co/Qwen/QwQ-32B-Preview) |
|
- **Parameters:** 32.5 billion |
|
- **Context Length:** Up to 32,768 tokens |
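
## Usage

The pre-quantized checkpoint can be loaded with the standard `transformers` API; since the quantization config is stored in the checkpoint, no extra `BitsAndBytesConfig` is needed at load time. The sketch below assumes this repository's model id (shown as a placeholder) and an example prompt; loading requires a CUDA GPU with `bitsandbytes` installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with this repository's actual Hub id.
MODEL_ID = "QwQ-32B-Preview-bnb-4bit"

def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Load the pre-quantized checkpoint and generate a completion.

    Requires a CUDA GPU and the bitsandbytes package; the 4-bit
    quantization config is read from the checkpoint itself.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Format the prompt with the model's chat template.
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate("How many r's are in the word 'strawberry'?"))
```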