---
license: apache-2.0
---
|
<div align="center">

<picture>

<img alt="specify theme context for images" src="https://raw.githubusercontent.com/01-ai/Yi/main/assets/img/Yi_logo_icon_light.svg" width="150px">

</picture>

</div>
|
|
|
<p align="center">

<a href="https://github.com/01-ai/Yi-1.5">GitHub</a> •

<a href="https://discord.gg/hYUwWddeAu">Discord</a> •

<a href="https://twitter.com/01ai_yi">Twitter</a> •

<a href="https://github.com/01-ai/Yi/issues/43#issuecomment-1827285245">WeChat</a>

<br/>

<a href="https://arxiv.org/abs/2403.04652">Paper</a> •

<a href="https://github.com/01-ai/Yi/tree/main?tab=readme-ov-file#faq">FAQ</a> •

<a href="https://github.com/01-ai/Yi/tree/main?tab=readme-ov-file#learning-hub">Learning Hub</a>

</p>
|
|
|
# Intro

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained from Yi on a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction following, while maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension.
|
|
|
<div align="center">

| Model | Context Length | Pre-trained Tokens |
| :------------: | :------------: | :------------: |
| Yi-1.5 | 4K | 3.6T |

</div>
|
|
|
# Models

- Chat models

<div align="center">

| Model | Download |
| :------------: | :------------: |
| Yi-1.5-34B-Chat | • 🤗 Hugging Face |
| Yi-1.5-9B-Chat | • 🤗 Hugging Face |
| Yi-1.5-6B-Chat | • 🤗 Hugging Face |

</div>
|
|
|
- Base models

<div align="center">

| Model | Download |
| :------------: | :------------: |
| Yi-1.5-34B | • 🤗 Hugging Face |
| Yi-1.5-9B | • 🤗 Hugging Face |
| Yi-1.5-6B | • 🤗 Hugging Face |

</div>
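The tables above list the released checkpoints. As a minimal, hedged sketch (assuming the checkpoints are hosted on Hugging Face under the `01-ai` organization, e.g. `01-ai/Yi-1.5-9B-Chat`), one way to fetch a checkpoint locally is with `huggingface_hub`:

```python
# Hypothetical download sketch; the repo id is an assumption based on the
# model names listed above, not a confirmed path.
from huggingface_hub import snapshot_download

# Download every file of the chosen checkpoint into the local cache and
# return the directory that holds them.
local_dir = snapshot_download(repo_id="01-ai/Yi-1.5-9B-Chat")
print(f"Model files are available at: {local_dir}")
```

Any of the chat or base model names in the tables can be substituted for the repo id, provided the corresponding repository exists.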
|
|
|
# Benchmarks

- Chat models

  Waiting for benchmark results.
|
|
|
|
|
- Base models

  - Yi-1.5-34B matches or exceeds some larger models in overall performance.

  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/rVDOPiSerpCz_prrT4n5M.png)

  - Yi-1.5-9B is a strong performer among similarly sized open-source models.
|
|
|
| Model | MMLU | CMMLU | BBH | AGIEval | HumanEval(+) | MBPP(+) | GSM8k | Math |
| -------------- | ---- | ----- | ---- | ------- | ------------ | ---------- | ----- | ----- |
| Gemma-7B | 64.3 | 48.4 | 41.1 | 46.0 | 33.5(28.0) | 45.8(32.8) | 55.7 | 24.8 |
| Qwen1.5-7B | 61.0 | 73.4 | 33.4 | 61.6 | 36.0(31.1) | 46.1(37.6) | 70.1 | 20.3 |
| Mistral-7B | 62.5 | 44.6 | 45.0 | 42.4 | 29.3(22.6) | 50.2(32.1) | 47.5 | 15.5 |
| Mixtral-8x7B | 70.6 | 53.0 | 52.4 | 49.5 | 40.2(31.1) | 60.7(31.1) | 65.7 | 28.4 |
| Llama3-8B_Base | 66.6 | 50.9 | 47.9 | 44.7 | 34.7(31.7) | 48.0(44.9) | 54.7 | 21.16 |
| Yi-1.5-6B | 63.5 | 70.8 | 45.7 | 56.0 | 36.5(28.7) | 56.8(46.9) | 62.2 | 28.42 |
| Yi-1.5-9B | 69.5 | 74.8 | 50.9 | 62.7 | 41.4(34.1) | 61.1(53.6) | 73.7 | 32.6 |
|
|
|
# Quick Start

To get up and running with the Yi-1.5 models quickly, see the [README](https://github.com/01-ai/Yi-1.5).
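As an illustrative sketch only (not the official quick start), the snippet below shows one common way to run a Yi-1.5 chat model with `transformers`; the repo id `01-ai/Yi-1.5-9B-Chat`, dtype, and generation settings are assumptions, so refer to the linked README for the supported setup.

```python
# Minimal chat inference sketch with transformers; the repo id and the
# dtype/device settings are assumptions, adjust them to your checkpoint
# and hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "01-ai/Yi-1.5-9B-Chat"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # fall back to torch.float16 if bf16 is unsupported
    device_map="auto",           # spread layers across the available devices
)

# Format a single-turn conversation with the tokenizer's chat template.
messages = [{"role": "user", "content": "Write a one-line Python hello world."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a reply and strip the prompt tokens before decoding.
output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```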