---
language:
- zh
- en
---
# BlueLM

<p align="center">
🖥 <a href="https://github.com/vivo-ai-lab/BlueLM" target="_blank">GitHub</a> • 📜 <a href="https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K-GPTQ/blob/main/OpenAtom%20Model%20License.pdf" target="_blank">LICENSE</a> • 🎯 <a href="https://developers.vivo.com/product/ai/bluelm" target="_blank">vivo Developers</a> • 🗨 <a href="https://github.com/vivo-ai-lab/BlueLM/blob/main/resources/wechat.png" target="_blank">WeChat</a>
</p>

## Introduction

BlueLM is a large-scale open-source pre-trained language model independently developed by the vivo AI Global Research Institute (vivo AI Lab). This release includes a 7B base model and a 7B chat model, each available in 2K and **32K** context-length versions.

- **High-quality Data**: BlueLM is trained on a high-quality corpus of **2.6 trillion** tokens. The training corpus consists mainly of Chinese and English data, with a small amount of Japanese and Korean data.
- **Stronger Performance**: BlueLM-7B-Chat achieves leading results on the **C-Eval** and **CMMLU** benchmarks and is highly competitive among open-source models of the same size.
- **Longer Context**: BlueLM-7B-Base-32K and BlueLM-7B-Chat-32K extend the context length from 2K to **32K**. They support longer-context understanding while keeping basic capabilities at a comparable level.
- **Model License**: BlueLM weights are open for academic research and commercial use.

The released models and their Hugging Face download links are listed in the table below:

| Version | Base Model | Chat Model | 4-bit Quantized Chat Model |
|:---:|:--------------------:|:--------------------:|:--------------------------:|
| 7B-2K  | [BlueLM-7B-Base](https://huggingface.co/vivo-ai/BlueLM-7B-Base)  | [BlueLM-7B-Chat](https://huggingface.co/vivo-ai/BlueLM-7B-Chat)  | [BlueLM-7B-Chat-4bits](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-4bits)  |
| 7B-32K | [BlueLM-7B-Base-32K](https://huggingface.co/vivo-ai/BlueLM-7B-Base-32K) | [BlueLM-7B-Chat-32K](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K) | [BlueLM-7B-Chat-32K-AWQ](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K-AWQ) / [BlueLM-7B-Chat-32K-GPTQ](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K-GPTQ) |
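
The 4-bit checkpoints in the last column mainly reduce the memory needed to hold the weights. The following is a rough back-of-the-envelope sketch, not a measured figure: it assumes exactly 7 billion parameters (the "7B" in the model names) and ignores quantization metadata such as scales and zero points, as well as activations and the KV cache. The function name `weight_memory_gib` is introduced here for illustration only.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1024**3


# Assumption: "7B" is taken as exactly 7e9 parameters.
NUM_PARAMS = 7e9

for label, bits in [("float16", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB for weights")
```

Under these assumptions the 4-bit weights take about a quarter of the float16 footprint. Actual usage depends on the quantization format and the inference runtime.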

## Benchmark Results

We evaluated BlueLM-7B-Chat-32K on the LongBench benchmark. The results are shown in the table below:

| Model                 | Average   | Summary  | Single-Doc QA | Multi-Doc QA  | Code  | Few-shot | Synthetic |
|:----------------------|:-----|:---------|:--------------|:--------------|:------|:---------|:----------|
| BlueLM-7B-Chat-32K    | 41.2 | 18.8     | 35.6          | 36.2          | 54.2  | 56.9     | 45.5      |
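
As a quick sanity check on the table, the Average column appears to be the unweighted mean of the six category scores. The short sketch below recomputes it from the values in the row above; treating the average as unweighted is an assumption, although the numbers are consistent with it.

```python
# Category scores for BlueLM-7B-Chat-32K, copied from the table above.
scores = {
    "Summary": 18.8,
    "Single-Doc QA": 35.6,
    "Multi-Doc QA": 36.2,
    "Code": 54.2,
    "Few-shot": 56.9,
    "Synthetic": 45.5,
}

# Assumption: the reported average weights every category equally.
average = sum(scores.values()) / len(scores)
print(round(average, 1))  # 41.2
```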

## Inference and Deployment

```python
>>> import torch
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> # Load the tokenizer and the GPTQ-quantized chat model onto the first GPU.
>>> tokenizer = AutoTokenizer.from_pretrained("vivo-ai/BlueLM-7B-Chat-32K-GPTQ", trust_remote_code=True, use_fast=False)
>>> model = AutoModelForCausalLM.from_pretrained("vivo-ai/BlueLM-7B-Chat-32K-GPTQ", device_map="cuda:0", torch_dtype=torch.float16, trust_remote_code=True, low_cpu_mem_usage=True, use_cache=False)
>>> model = model.eval()
>>> # Prompt: "Write a review of Liu Cixin's novel The Three-Body Problem, about 1,000 characters long."
>>> # The user message goes between the [|Human|]: and [|AI|]: tags.
>>> inputs = tokenizer("[|Human|]:写一篇关于刘慈欣《三体》小说的读后感,1000字左右[|AI|]:", return_tensors="pt")
>>> inputs = inputs.to("cuda:0")
>>> # Generate up to 2048 new tokens with a mild repetition penalty, then decode.
>>> pred = model.generate(**inputs, max_new_tokens=2048, repetition_penalty=1.1)
>>> print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
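
The example above wraps a single user message in `[|Human|]:` and `[|AI|]:` tags. A small helper can build such prompts programmatically. The sketch below is illustrative only: `build_prompt` is a hypothetical name, and the way earlier turns are concatenated for multi-turn conversations is an assumption not confirmed by this model card. Only the single-turn format comes from the example above.

```python
from typing import List, Optional, Tuple

HUMAN_TAG = "[|Human|]:"
AI_TAG = "[|AI|]:"


def build_prompt(query: str, history: Optional[List[Tuple[str, str]]] = None) -> str:
    """Wrap a user query in the chat tags used by the example above.

    `history` is an optional list of (user_message, model_reply) pairs.
    Assumption: earlier turns are concatenated directly, with no separator;
    check the GitHub repository for the exact multi-turn format.
    """
    parts = []
    for user_msg, ai_msg in history or []:
        parts.append(f"{HUMAN_TAG}{user_msg}{AI_TAG}{ai_msg}")
    # The final turn ends with the AI tag so the model continues from there.
    parts.append(f"{HUMAN_TAG}{query}{AI_TAG}")
    return "".join(parts)


print(build_prompt("Hello"))
```

The returned string can be passed to `tokenizer(...)` in place of the hard-coded prompt.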

For more instructions, please refer to our [GitHub repository](https://github.com/vivo-ai-lab/BlueLM).

## License

To make this project more open and flexible and to serve more developers and users, the open-source license for this project's large models has been updated. Effective December 25, 2024, it changed from the [Community License for BlueLM Model](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K-GPTQ/blob/main/MODEL_LICENSE) to the [OpenAtom Model License](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K-GPTQ/blob/main/OpenAtom%20Model%20License.pdf).

Under the new license, users can use, modify, and distribute this project's large models with fewer restrictions. Please make sure you read and understand the new [license](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K-GPTQ/blob/main/OpenAtom%20Model%20License.pdf). We welcome any feedback on this change. You can contact us by email ([email protected]).

Thank you for your support of this project!