Crystalcareai
committed on
Update README.md
README.md CHANGED
@@ -1,9 +1,10 @@
 This is an MoE of Llama-3-8b with 4 experts. It does not use semantic routing, since it uses the deepseek-moe architecture. There is no router and no gate: all experts are active on every token.
 
-```
+```python
+import torch
 from transformers import AutoTokenizer, TextStreamer, AutoModelForCausalLM
 
-model_path = "
+model_path = "Crystalcareai/llama-3-4x8b"
 model = AutoModelForCausalLM.from_pretrained(
     model_path,
     device_map="auto",
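
The hunk cuts off after `device_map="auto",`, so the README snippet as shown in the diff is incomplete. A runnable sketch that completes it might look like the following; the `trust_remote_code=True` flag, the `bfloat16` dtype, and the prompt and generation settings are assumptions for illustration, not part of the commit.

```python
import torch
from transformers import AutoTokenizer, TextStreamer, AutoModelForCausalLM

model_path = "Crystalcareai/llama-3-4x8b"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",           # shard across available devices
    torch_dtype=torch.bfloat16,  # assumed dtype; not shown in the hunk
    trust_remote_code=True,      # assumed: custom deepseek-moe modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Stream generated tokens to stdout, omitting the prompt itself.
streamer = TextStreamer(tokenizer, skip_prompt=True)

prompt = "Explain mixture-of-experts models in one paragraph."  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # Llama-3 tokenizers ship no pad token
    streamer=streamer,
)
```

Because there is no gate, all four experts run on every token, so inference compute scales with the full set of experts rather than a routed subset.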