Commit 84f68dc (verified), committed by BBuf
1 Parent(s): 531028a

Update README.md

Files changed (1)
  1. README.md +12 -31
README.md CHANGED
@@ -37,7 +37,7 @@ We introduce two innovative techniques: Gating Logit Normalization, which enhanc
  Skywork-MoE demonstrates comparable or superior performance to models with more parameters or more activated parameters, such as Grok-1, DBRX, Mistral 8*22, and Deepseek-V2.
 
  # News and Updates
- * 2024.6.3 We release the **Skywork-MoE-base** model.
 
  # Table of contents
 
@@ -49,22 +49,15 @@ Skywork-MoE demonstrates comparable or superior performance to models with more
  - [🤝Contact Us and Citation](#Contact-Us-and-Citation)
 
 
- # Download URL
-
- | | HuggingFace Model | ModelScope Model | Wisemodel Model |
- |:-------:|:-----------:|:-----------------------------:|:-----------------------------:|
- | **Skywork-MoE-base** | 🤗 [Skywork-MoE-base](https://github.com/SkyworkAI/Skywork-MoE) | 🤖[Skywork-MoE-base](https://www.modelscope.cn/models/skywork/Skywork-MoE-base) | 👾[Skywork-MoE-base](https://wisemodel.cn/models/Skywork/Skywork-MoE-base) |
- | **Skywork-MoE-Base-FP8** | 🤗 [Skywork-MoE-Base-FP8](https://github.com/SkyworkAI/Skywork-MoE) | 🤖 | 👾 |
-
  # Benchmark Results
- We evaluated Skywork-MoE-base model on various popular benchmarks, including C-Eval, MMLU, CMMLU, GSM8K, MATH and HumanEval.
  <img src="misc/skywork_moe_base_evaluation.png" alt="Image" width="600" height="280">
 
  # Demonstration of Hugging Face Model Inference
 
  ## Base Model Inference
 
- We can perform inference for the Skywork-MoE-base (16x13B size) model using HuggingFace on 8xA100/A800 or higher GPU hardware configurations.
 
  ```python
 
@@ -100,35 +93,23 @@ comming soon...
 
  ## Quickstart with vLLM
 
- We provide a method to quickly deploy the Skywork-Moe-base model based on vllm.
-
- Under fp8 precision you can run Skywork-Moe-base with just only 8*4090.
 
  You can get the source code in [`vllm`](https://github.com/SkyworkAI/vllm)
 
- You can get the fp8 model in [`Skywork-MoE-Base-FP8`](https://huggingface.co/Skywork/Skywork-MoE-Base-FP8)
 
  ### Based on local environment
 
- Since pytorch only supports 4090 using fp8 precision in the nightly version, you need to install the corresponding or newer version of pytorch.
-
- ``` shell
- # for cuda12.1
- pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
- # for cuda12.4
- pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu124
- ```
-
- Some other dependencies also need to be installed:
 
  ```shell
  pip3 install xformers vllm-flash-attn
  ```
 
- Then clone the [`vllm`](https://github.com/SkyworkAI/vllm) provided by skywork and change to `skywork-moe` branch:
 
  ``` shell
- git clone https://github.com/SkyworkAI/vllm.git -b skywork-moe
  cd vllm
  ```
 
@@ -138,7 +119,7 @@ Then compile and install vllm:
  MAX_JOBS=8 python3 setup.py install
  ```
 
- ### Base on docker
 
  You can use the docker image provided by skywork to run vllm directly:
 
@@ -149,7 +130,7 @@ docker pull registry.cn-wulanchabu.aliyuncs.com/triple-mu/skywork-moe-vllm:v1
  Then start the container and set the model path and working directory.
 
  ```shell
- model_path="Skywork/Skywork-MoE-Base-FP8"
  workspace=${PWD}
 
  docker run \
@@ -162,19 +143,19 @@ docker run \
  --privileged=true \
  --ulimit stack=67108864 \
  --ipc=host \
- -v ${model_path}:/Skywork-MoE-Base-FP8 \
  -v ${workspace}:/workspace \
  registry.cn-wulanchabu.aliyuncs.com/triple-mu/skywork-moe-vllm:v1
  ```
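Docker's `-v` option takes an absolute host path (or a named volume) on the left of the colon, so `model_path` is assumed here to point at a local directory holding the downloaded weights rather than a Hub repository id. The following is a minimal sketch of building that mount argument, not part of the Skywork instructions: `docker_volume_arg` is a hypothetical helper, and naming the container directory after the host directory is an assumption that mirrors the pattern in the command above.

```python
from pathlib import Path


def docker_volume_arg(host_dir, container_dir=None):
    """Build the value passed to `docker run -v` for a local model directory.

    host_dir is resolved to an absolute path. container_dir defaults to
    "/<host directory name>" (an assumption, not a Skywork requirement).
    """
    host = Path(host_dir).expanduser().resolve()
    if not host.is_dir():
        raise FileNotFoundError(f"model directory not found: {host}")
    if container_dir is None:
        container_dir = f"/{host.name}"
    return f"{host}:{container_dir}"


# Example with a hypothetical path; it raises FileNotFoundError
# unless the directory exists on the host:
# print(docker_volume_arg("/data/models/Skywork-MoE-Base"))
```

The returned string can be used as the argument after `-v` in the `docker run` command.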
 
- Now, you can run the Skywork Moe base model for fun!
 
  ### Text Completion
 
  ``` python
  from vllm import LLM, SamplingParams
 
- model_path = '/path/to/skywork-moe-base'
  prompts = [
  "The president of the United States is",
  "The capital of France is",
 
  Skywork-MoE demonstrates comparable or superior performance to models with more parameters or more activated parameters, such as Grok-1, DBRX, Mistral 8*22, and Deepseek-V2.
 
  # News and Updates
+ * 2024.6.3 We release the **Skywork-MoE-Base** model.
 
  # Table of contents
 
 
  - [🤝Contact Us and Citation](#Contact-Us-and-Citation)
 
 
  # Benchmark Results
+ We evaluated Skywork-MoE-Base model on various popular benchmarks, including C-Eval, MMLU, CMMLU, GSM8K, MATH and HumanEval.
  <img src="misc/skywork_moe_base_evaluation.png" alt="Image" width="600" height="280">
 
  # Demonstration of Hugging Face Model Inference
 
  ## Base Model Inference
 
+ We can perform inference for the Skywork-MoE-Base (16x13B size) model using HuggingFace on 8xA100/A800 or higher GPU hardware configurations.
 
  ```python
 
 
 
  ## Quickstart with vLLM
 
+ We provide a method to quickly deploy the Skywork-MoE-Base model based on vllm.
 
  You can get the source code in [`vllm`](https://github.com/SkyworkAI/vllm)
 
 
  ### Based on local environment
 
+ Some dependencies need to be installed:
 
  ```shell
  pip3 install xformers vllm-flash-attn
  ```
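Once the install finishes, it can help to confirm that the packages are visible to the interpreter before building vllm. The following is a minimal sketch, not part of the Skywork instructions: `check_modules` and `report` are hypothetical helpers, and the import names `xformers` and `vllm_flash_attn` are assumed to correspond to the pip packages above.

```python
import importlib.util

# Assumed import names for the pip packages `xformers` and `vllm-flash-attn`.
REQUIRED_MODULES = ["xformers", "vllm_flash_attn"]


def check_modules(names):
    """Return {module_name: bool} by locating each module without importing it."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised packages
            status[name] = False
    return status


def report(names=REQUIRED_MODULES):
    """Print one line per module saying whether it was found."""
    for name, found in check_modules(names).items():
        print(f"{name:<16} {'found' if found else 'MISSING'}")


if __name__ == "__main__":
    report()
```

Adding `"vllm"` to the list after the compile step would check the built package in the same way.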
 
+ Then clone the [`vllm`](https://github.com/SkyworkAI/vllm) provided by skywork:
 
  ``` shell
+ git clone https://github.com/SkyworkAI/vllm.git
  cd vllm
  ```
 
 
  MAX_JOBS=8 python3 setup.py install
  ```
 
+ ### Based on docker
 
  You can use the docker image provided by skywork to run vllm directly:
 
 
  Then start the container and set the model path and working directory.
 
  ```shell
+ model_path="Skywork/Skywork-MoE-Base"
  workspace=${PWD}
 
  docker run \
 
  --privileged=true \
  --ulimit stack=67108864 \
  --ipc=host \
+ -v ${model_path}:/Skywork-MoE-Base \
  -v ${workspace}:/workspace \
  registry.cn-wulanchabu.aliyuncs.com/triple-mu/skywork-moe-vllm:v1
  ```
 
+ Now, you can run the Skywork-MoE-Base model for fun!
 
  ### Text Completion
 
  ``` python
  from vllm import LLM, SamplingParams
 
+ model_path = 'Skywork/Skywork-MoE-Base'
  prompts = [
  "The president of the United States is",
  "The capital of France is",