Athene-70B-Preview / README.md
banghua's picture
Update README.md
32d98c3 verified
---
license: cc-by-nc-4.0
language:
- en
library_name: transformers
tags:
- RLHF
- Nexusflow
- Athene
- Chat Model
---
# Athene-Llama3-70B
## Warning. This repo is a copy of the initial release of Athene-70B. Please refer to the link [here](https://huggingface.co/Nexusflow/Athene-70B) for the latest version.
We introduce Athene-Llama3-70B, an open-weights LLM trained through RLHF based off Llama-3-70B-Instruct. Athene-70B achieves a high score on Arena-Hard-Auto, a proxy benchmark for Chatbot Arena.
- **Developed by:** The Nexusflow Team (Evan Frick\*, Peter Jin\*, Tianle Li\*, Karthik Ganesan, Jian Zhang, Jiantao Jiao and Banghua Zhu).
- **Model type:** Chat Model
- **Finetuned from model:** [Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).
Blog: https://nexusflow.ai/blogs/athene
| Model | Arena-Hard |
|---------------------------------|------------|
| Claude-3.5-Sonnet (Proprietary) | 79.3% |
| GPT-4o (Proprietary) | 79.2% |
| **Athene-70B (Open)** | 77.8% |
| Gemini-Pro-1.5 (Proprietary) | 72.0% |
| Gemma-2-27B (Open) | 57.0% |
| Llama-3-70B (Open) | 46.6% |
## Usage
Athene-70B uses the same chat template as Llama-3-70B-Instruct. Below is an example simple usage using the Transformers library.
```Python
import transformers
import torch
model_id = "Nexusflow/Athene-70B"
pipeline = transformers.pipeline(
"text-generation",
model=model_id,
model_kwargs={"torch_dtype": torch.bfloat16},
device_map="auto",
)
messages = [
{"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
{"role": "user", "content": "Whooo are you?"},
]
terminators = [
pipeline.tokenizer.eos_token_id,
pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>")
]
outputs = pipeline(
messages,
max_new_tokens=256,
eos_token_id=terminators,
do_sample=True,
temperature=0.6,
top_p=0.9,
)
print(outputs[0]["generated_text"][-1])
```
## Acknowledgment
We would like to thank the [LMSYS Organization](https://lmsys.org/) for their support of testing the model. We would like to thank Meta AI and the open source community for their efforts in providing the datasets and base models.
## Citation
```
@misc{Athene2024,
title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
url = {https://nexusflow.ai/blogs/athene},
author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},
month = {July},
year = {2024}
}
```