---
license: other
language:
- en
library_name: transformers
tags:
- RLHF
- Nexusflow
- Athene
- Chat Model
---
# Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks
We introduce Athene-V2-Chat-72B, an open-weights LLM that rivals GPT-4o across benchmarks. It is trained with RLHF on top of Qwen 2.5 72B. Athene-V2-Chat-72B excels in chat, math, and coding. Its sister model, Athene-V2-Agent-72B, surpasses GPT-4o in complex function calling and agent applications.
**Benchmark performance:** see the blog post linked below for the full benchmark charts.
- **Developed by:** The Nexusflow Team
- **Model type:** Chat Model
- **Finetuned from model:** Qwen 2.5 72B
- **License:** Nexusflow Research License
- **Blog:** https://nexusflow.ai/blogs/athene-V2
## Usage
Athene-V2-Chat uses the same chat template as Qwen 2.5 72B. Below is a simple example of using the model with the Transformers library.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Nexusflow/Athene-V2-Chat"

# Load the model and tokenizer. device_map="auto" shards the 72B model
# across all available GPUs, and torch_dtype="auto" uses the dtype stored
# in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template and append the
# assistant turn marker so the model knows to respond.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
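Since Athene-V2-Chat inherits Qwen 2.5's chat template, `apply_chat_template` renders the conversation in ChatML format, roughly as sketched below (the template may also insert a default system message, elided here as `...`):

```
<|im_start|>system
...<|im_end|>
<|im_start|>user
Give me a short introduction to large language models.<|im_end|>
<|im_start|>assistant
```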
We found that adding a system prompt that instructs the model to think step by step further improves its performance on math and on problems like counting the r's in *strawberry*; a sketch is shown below. For fairness, we did not include such a system prompt during chat evaluation.
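A minimal sketch of this approach, reusing the `tokenizer` loaded above (the system prompt wording here is illustrative, not the exact prompt used by Nexusflow):

```python
# Illustrative step-by-step system prompt (assumed wording, not Nexusflow's
# exact prompt), prepended as an extra message before the user turn.
messages = [
    {"role": "system", "content": "Think through the problem step by step before giving your final answer."},
    {"role": "user", "content": "How many r's are in the word 'strawberry'?"},
]

# The rest of the generation code is identical to the example above.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
```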
## Acknowledgment
We would like to thank the LMSYS Organization for their support in testing the model. We would also like to thank the Qwen Team and the open-source community for their efforts in providing the datasets and base models.