---
license: other
language:
  - en
library_name: transformers
tags:
  - RLHF
  - Nexusflow
  - Athene
  - Chat Model
---

# Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks

Nexusflow HF - Nexusflow Discord

We introduce Athene-V2-Chat-72B, an open-weights LLM that rivals GPT-4o across benchmarks. It is trained with RLHF on top of Qwen-2.5-72B. Athene-V2-Chat-72B excels in chat, math, and coding. Its sister model, Athene-V2-Agent-72B, surpasses GPT-4o in complex function calling and agent applications.

Benchmark performance:

*(figure: benchmark comparison)*

## Usage

Athene-V2-Chat uses the same chat template as Qwen 2.5 72B. Below is a simple usage example with the Transformers library.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Nexusflow/Athene-V2-Chat"

# Load the model and tokenizer; device_map="auto" shards the 72B model
# across available GPUs, and torch_dtype="auto" uses the checkpoint's dtype.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "user", "content": prompt}
]
# Format the conversation with the Qwen 2.5 chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
)
# Strip the prompt tokens so only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

We found that adding a system prompt that encourages the model to think step by step further improves its performance on math and on problems such as counting the r's in "strawberry". For fairness, we do not include such a system prompt during chat evaluation.
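
As an illustration, such a system prompt can be prepended to the messages list before applying the chat template, reusing the `tokenizer` from the snippet above. The prompt wording below is a hypothetical example, not the one used in our experiments:

```python
# Hypothetical system prompt encouraging step-by-step reasoning;
# the exact wording is illustrative, not the prompt used in evaluation.
messages = [
    {"role": "system", "content": "Think through the problem step by step before giving your final answer."},
    {"role": "user", "content": "How many r's are in the word 'strawberry'?"},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
```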

## Acknowledgment

We would like to thank the LMSYS Organization for their support in testing the model, and the Qwen Team and the open-source community for their efforts in providing the datasets and base models.