## Requirements
```bash
pip install -U transformers autoawq
```
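Before loading the model, it can help to confirm the environment is ready. This quick check is not part of the original card; it assumes a CUDA-capable GPU is present:

```python
# Sanity check: library versions and GPU/bf16 availability.
import torch
import transformers

print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("bf16 supported:", torch.cuda.is_bf16_supported())
```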
## Transformers inference
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Prefer bfloat16 on GPUs that support it; fall back to float16 otherwise.
dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
device = "auto"
model_name = "jakiAJK/DeepSeek-R1-Distill-Llama-8B_AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map=device,
    trust_remote_code=True,
    torch_dtype=dtype,
)
model.eval()

chat = [
    {"role": "user", "content": "List any 5 country capitals."},
]

# Render the conversation with the model's chat template and append the
# generation prompt so the model continues as the assistant.
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Move inputs to the device the model was placed on by device_map="auto".
input_tokens = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(**input_tokens, max_new_tokens=100)
output = tokenizer.batch_decode(output)
print(output[0])
```
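For interactive use, the same model can stream tokens to stdout as they are generated. This optional sketch is not from the original card; it reuses the `model`, `tokenizer`, and `input_tokens` defined above with transformers' built-in `TextStreamer`:

```python
from transformers import TextStreamer

# Print tokens as they are generated; skip_prompt hides the echoed input.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**input_tokens, max_new_tokens=100, streamer=streamer)
```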
## Model tree for jakiAJK/DeepSeek-R1-Distill-Llama-8B_AWQ

Base model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B