JustinLin610 committed on
Commit
df4768d
·
1 Parent(s): 779ebe3

Update README.md (#44)

- Update README.md (ff975d7488617cf537e19589013615547ea2c9b7)

Files changed (1)
  1. README.md +26 -11
README.md CHANGED
@@ -16,7 +16,7 @@ inference: false
16
  <br>
17
 
18
  <p align="center">
19
- 🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/models/qwen">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/abs/2309.16609">Paper</a>&nbsp&nbsp | &nbsp&nbsp🖥️ <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a>
20
  <br>
21
  <a href="https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png">WeChat (微信)</a>&nbsp&nbsp | &nbsp&nbsp DingTalk (钉钉) &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp
22
  </p>
@@ -27,11 +27,11 @@ inference: false
27
 
28
  **通义千问-7B(Qwen-7B)**是阿里云研发的通义千问大模型系列的70亿参数规模的模型。Qwen-7B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样,覆盖广泛,包括大量网络文本、专业书籍、代码等。同时,在Qwen-7B的基础上,我们使用对齐机制打造了基于大语言模型的AI助手Qwen-7B-Chat。相较于最初开源的Qwen-7B模型,我们现已将预训练模型和Chat模型更新到效果更优的版本。本仓库为Qwen-7B-Chat的仓库。
29
 
30
- 如果您想了解更多关于通义千问-7B开源模型的细节,我们建议您参阅[Github代码库](https://github.com/QwenLM/Qwen)。
31
 
32
  **Qwen-7B** is the 7B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. Now we have updated both our pretrained and chat models with better performances. This repository is the one for Qwen-7B-Chat.
33
 
34
- For more details about Qwen, please refer to the [Github](https://github.com/QwenLM/Qwen) code repository.
35
  <br>
36
 
37
  ## 要求(Requirements)
@@ -54,15 +54,14 @@ To run Qwen-7B-Chat, please make sure you meet the above requirements, and then
54
  pip install transformers==4.32.0 accelerate tiktoken einops scipy transformers_stream_generator==0.0.4 peft deepspeed
55
  ```
56
 
57
- 另外,推荐安装`flash-attention`库,以实现更高的效率和更低的显存占用。
58
 
59
- In addition, it is recommended to install the `flash-attention` library for higher efficiency and lower memory usage.
60
 
61
  ```bash
62
- git clone -b v1.0.8 https://github.com/Dao-AILab/flash-attention
63
  cd flash-attention && pip install .
64
  # 下方安装可选,安装可能比较缓慢。
65
- # Below are optional. Installing them might be slow.
66
  # pip install csrc/layer_norm
67
  # pip install csrc/rotary
68
  ```
@@ -90,8 +89,8 @@ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code
90
  # use auto mode, automatically select precision based on the device.
91
  model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True).eval()
92
 
93
- # Specify hyperparameters for generation
94
- model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True) # 可指定不同的生成长度、top_p等相关超参
95
 
96
  # 第一轮对话 1st dialogue turn
97
  response, history = model.chat(tokenizer, "你好", history=None)
@@ -114,9 +113,9 @@ print(response)
114
  # 《奋斗创业:一个年轻人的成功之路》
115
  ```
116
 
117
- 关于更多的使用说明,请参考我们的[Github repo](https://github.com/QwenLM/Qwen)获取更多信息。
118
 
119
- For more information, please refer to our [Github repo](https://github.com/QwenLM/Qwen) for more information.
120
  <br>
121
 
122
  ## Tokenizer
@@ -613,6 +612,22 @@ Qwen-Chat also has the capability to be used as a [HuggingFace Agent](https://hu
613
  If you meet problems, please refer to [FAQ](https://github.com/QwenLM/Qwen/blob/main/FAQ.md) and the issues first to search a solution before you launch a new issue.
614
  <br>
615
 
616
  ## 使用协议(License Agreement)
617
 
618
  我们的代码和模型权重对学术研究完全开放,并支持商用。请查看[LICENSE](https://github.com/QwenLM/Qwen/blob/main/LICENSE)了解具体的开源协议细节。如需商用,请填写[问卷](https://dashscope.console.aliyun.com/openModelApply/qianwen)申请。
 
16
  <br>
17
 
18
  <p align="center">
19
+ 🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/organization/qwen">ModelScope</a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://arxiv.org/abs/2309.16609">Paper</a>&nbsp&nbsp | &nbsp&nbsp🖥️ <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a>
20
  <br>
21
  <a href="https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png">WeChat (微信)</a>&nbsp&nbsp | &nbsp&nbsp DingTalk (钉钉) &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp
22
  </p>
 
27
 
28
  **通义千问-7B(Qwen-7B)**是阿里云研发的通义千问大模型系列的70亿参数规模的模型。Qwen-7B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样,覆盖广泛,包括大量网络文本、专业书籍、代码等。同时,在Qwen-7B的基础上,我们使用对齐机制打造了基于大语言模型的AI助手Qwen-7B-Chat。相较于最初开源的Qwen-7B模型,我们现已将预训练模型和Chat模型更新到效果更优的版本。本仓库为Qwen-7B-Chat的仓库。
29
 
30
+ 如果您想了解更多关于通义千问-7B开源模型的细节,我们建议您参阅[GitHub代码库](https://github.com/QwenLM/Qwen)。
31
 
32
  **Qwen-7B** is the 7B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. Now we have updated both our pretrained and chat models with better performances. This repository is the one for Qwen-7B-Chat.
33
 
34
+ For more details about Qwen, please refer to the [GitHub](https://github.com/QwenLM/Qwen) code repository.
35
  <br>
36
 
37
  ## 要求(Requirements)
 
54
  pip install transformers==4.32.0 accelerate tiktoken einops scipy transformers_stream_generator==0.0.4 peft deepspeed
55
  ```
56
 
57
+ 另外,推荐安装`flash-attention`库(**当前已支持flash attention 2**),以实现更高的效率和更低的显存占用。
58
 
59
+ In addition, it is recommended to install the `flash-attention` library (**we support flash attention 2 now.**) for higher efficiency and lower memory usage.
60
 
61
  ```bash
62
+ git clone https://github.com/Dao-AILab/flash-attention
63
  cd flash-attention && pip install .
64
  # 下方安装可选,安装可能比较缓慢。
 
65
  # pip install csrc/layer_norm
66
  # pip install csrc/rotary
67
  ```
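Whether the optional `flash-attention` install above succeeded can be checked before loading the model. The following is a minimal sketch, assuming the package is importable under the module name `flash_attn`; the helper name `flash_attn_status` is hypothetical and is not part of Qwen or `transformers`.

```python
import importlib.util
from importlib import metadata


def flash_attn_status(module_name: str = "flash_attn") -> str:
    """Describe whether a module can be imported in this environment.

    `module_name` defaults to "flash_attn", which is assumed here to be
    the import name of the flash-attention package.
    """
    # find_spec returns None when the top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name}: not installed"
    try:
        # The distribution name can differ from the import name,
        # so a missing version is reported rather than treated as an error.
        version = metadata.version(module_name)
    except metadata.PackageNotFoundError:
        version = "unknown version"
    return f"{module_name}: installed ({version})"


if __name__ == "__main__":
    print(flash_attn_status())
```

This only reports whether the module can be found; it does not confirm that the compiled kernels match the local CUDA and PyTorch versions.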
 
89
  # use auto mode, automatically select precision based on the device.
90
  model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True).eval()
91
 
92
+ # Specify hyperparameters for generation. With transformers>=4.32.0 this step is not needed.
93
+ # model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True) # 可指定不同的生成长度、top_p等相关超参
94
 
95
  # 第一轮对话 1st dialogue turn
96
  response, history = model.chat(tokenizer, "你好", history=None)
 
113
  # 《奋斗创业:一个年轻人的成功之路》
114
  ```
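The updated comment in the snippet above says the explicit `GenerationConfig` assignment is unnecessary with `transformers>=4.32.0`. A minimal sketch of making that choice at runtime follows; the helper `needs_explicit_generation_config` is a hypothetical name introduced for illustration, and it treats pre-release suffixes such as `.dev0` as the corresponding release, which is an assumption.

```python
import re


def needs_explicit_generation_config(transformers_version: str) -> bool:
    """Return True when the given version string is older than 4.32.0."""
    # Read the leading "major.minor[.patch]" digits; any suffix is ignored.
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", transformers_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {transformers_version!r}")
    major, minor, patch = (int(part) if part else 0 for part in match.groups())
    return (major, minor, patch) < (4, 32, 0)


# Usage sketch (needs transformers and the model files, so left as comments):
# import transformers
# from transformers import GenerationConfig
# if needs_explicit_generation_config(transformers.__version__):
#     model.generation_config = GenerationConfig.from_pretrained(
#         "Qwen/Qwen-7B-Chat", trust_remote_code=True
#     )
```

Keeping the version test in a small pure function makes it possible to check the threshold without downloading the model.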
115
 
116
+ 关于更多的使用说明,请参考我们的[GitHub repo](https://github.com/QwenLM/Qwen)获取更多信息。
117
 
118
+ For more usage instructions, please refer to our [GitHub repo](https://github.com/QwenLM/Qwen).
119
  <br>
120
 
121
  ## Tokenizer
 
612
  If you meet problems, please refer to [FAQ](https://github.com/QwenLM/Qwen/blob/main/FAQ.md) and the issues first to search a solution before you launch a new issue.
613
  <br>
614
 
615
+ ## 引用 (Citation)
616
+
617
+ 如果你觉得我们的工作对你有帮助,欢迎引用!
618
+
619
+ If you find our work helpful, feel free to cite us.
620
+
621
+ ```
622
+ @article{qwen,
623
+ title={Qwen Technical Report},
624
+ author={Jinze Bai and Shuai Bai and Yunfei Chu and Zeyu Cui and Kai Dang and Xiaodong Deng and Yang Fan and Wenbin Ge and Yu Han and Fei Huang and Binyuan Hui and Luo Ji and Mei Li and Junyang Lin and Runji Lin and Dayiheng Liu and Gao Liu and Chengqiang Lu and Keming Lu and Jianxin Ma and Rui Men and Xingzhang Ren and Xuancheng Ren and Chuanqi Tan and Sinan Tan and Jianhong Tu and Peng Wang and Shijie Wang and Wei Wang and Shengguang Wu and Benfeng Xu and Jin Xu and An Yang and Hao Yang and Jian Yang and Shusheng Yang and Yang Yao and Bowen Yu and Hongyi Yuan and Zheng Yuan and Jianwei Zhang and Xingxuan Zhang and Yichang Zhang and Zhenru Zhang and Chang Zhou and Jingren Zhou and Xiaohuan Zhou and Tianhang Zhu},
625
+ journal={arXiv preprint arXiv:2309.16609},
626
+ year={2023}
627
+ }
628
+ ```
629
+ <br>
630
+
631
  ## 使用协议(License Agreement)
632
 
633
  我们的代码和模型权重对学术研究完全开放,并支持商用。请查看[LICENSE](https://github.com/QwenLM/Qwen/blob/main/LICENSE)了解具体的开源协议细节。如需商用,请填写[问卷](https://dashscope.console.aliyun.com/openModelApply/qianwen)申请。