---
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
---
This is an FP8-Dynamic quantization of [DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B).

DeepSeek's Qwen-distilled models are compact reasoning models derived from DeepSeek-R1, achieving exceptional performance by distilling the reasoning patterns of larger models into smaller architectures. Spanning 1.5B to 70B parameters, the models are based on Qwen2.5 and Llama3, with the standout DeepSeek-R1-Distill-Qwen-32B outperforming OpenAI-o1-mini and setting new state-of-the-art results for dense models. By combining reinforcement learning (RL) and supervised fine-tuning (SFT), these open-source models provide a powerful resource for advancing research and practical applications.
## Evaluations

This model achieves an accuracy recovery of 100.04% (an average benchmark score of 74.06 versus 74.03 for the unquantized baseline).

| __English__ | __[DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B)__ | __[DeepSeek-R1-Distill-Qwen-32B-FP8-Dynamic (this)](https://huggingface.co/cortecs/DeepSeek-R1-Distill-Qwen-32B-FP8-Dynamic)__ |
|:------------|------:|------:|
| Avg.        | 74.03 | 74.06 |
| ARC         | 68.2  | 68.9  |
| Hellaswag   | 74    | 73.7  |
| MMLU        | 79.88 | 79.57 |

We did not check for data contamination.

Evaluation was done using [Eval. Harness](https://github.com/EleutherAI/lm-evaluation-harness) with `limit=1000`.
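For reference, a comparable run with the harness CLI might look like the following sketch; the vLLM backend and the exact task names are assumptions rather than the settings used for the table above:

```
lm_eval --model vllm \
    --model_args pretrained=cortecs/DeepSeek-R1-Distill-Qwen-32B-FP8-Dynamic \
    --tasks arc_challenge,hellaswag,mmlu \
    --limit 1000
```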
## Usage

Install **vLLM** and run the [server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#openai-compatible-server).
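vLLM is distributed on PyPI, so a typical installation (assuming a Linux host with a CUDA-capable GPU) is:

```
pip install vllm
```

Then start the OpenAI-compatible server: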
```
python -m vllm.entrypoints.openai.api_server --model cortecs/DeepSeek-R1-Distill-Qwen-32B-FP8-Dynamic
```
Access the model:

```
curl http://localhost:8000/v1/completions -H "Content-Type: application/json" -d '{
    "model": "cortecs/DeepSeek-R1-Distill-Qwen-32B-FP8-Dynamic",
    "prompt": "San Francisco is a"
}'
```
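Because DeepSeek-R1-Distill-Qwen-32B is a chat-style reasoning model, the OpenAI-compatible chat endpoint can also be used; a minimal sketch (the example prompt is illustrative only):

```
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "cortecs/DeepSeek-R1-Distill-Qwen-32B-FP8-Dynamic",
    "messages": [{"role": "user", "content": "Explain FP8 quantization in one paragraph."}]
}'
```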