---
license: other
license_name: mrl
license_link: https://mistral.ai/licenses/MRL-0.1.md
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ru
- ko
pipeline_tag: text-generation
---
# Mistral-Large-218B-Instruct
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6604e5b21eb292d6df393365/P-BGJ5Ba2d1NkpdGXNThe.png)
Mistral-Large-218B-Instruct is a dense Large Language Model (LLM) with 218 billion parameters, created as a self-merge of the original Mistral Large 2.
## Key features
- 218 billion parameters
- Multi-lingual support for dozens of languages
- Trained on 80+ coding languages
- 128k context window
- Mistral Research License: Allows usage and modification for research and non-commercial purposes
## Hardware Requirements
At 218B parameters, this model requires substantial computational resources for inference:
- Recommended: 8xH100 (640GB)
- Alternatively: Distributed inference setup across multiple machines
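
As a rough check on the recommendation above, the memory needed for the weights alone can be estimated from the parameter count. The sketch below assumes 2 bytes per parameter (the `bfloat16` dtype used for the merge), decimal gigabytes, and 80 GB per H100. It ignores the KV cache, activations, and framework overhead, so real usage will be higher. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 218e9          # parameter count stated in this card
H100_GB = 80              # assumption: 80 GB per H100
NUM_GPUS = 8

bf16_gb = weight_memory_gb(N_PARAMS, 2)   # bfloat16 stores 2 bytes per parameter
cluster_gb = H100_GB * NUM_GPUS           # total memory of the recommended setup
headroom_gb = cluster_gb - bf16_gb        # left for KV cache, activations, overhead

print(f"bf16 weights: {bf16_gb:.0f} GB")
print(f"8xH100 total: {cluster_gb} GB")
print(f"headroom:     {headroom_gb:.0f} GB")
```

Under these assumptions the weights take about 436 GB, which leaves roughly 204 GB of the 640 GB for everything else.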
## Limitations
- No built-in moderation mechanisms
- Computationally expensive inference
- May exhibit biases present in training data
- Outputs should be critically evaluated for sensitive applications
## Notes
This was just a fun testing model, merged with the `merge.py` script in the base of the repo.
## Quants
- GGUF: [mradermacher/Mistral-Large-218B-Instruct-GGUF](https://huggingface.co/mradermacher/Mistral-Large-218B-Instruct-GGUF)
- imatrix GGUF: [mradermacher/Mistral-Large-218B-Instruct-i1-GGUF](https://huggingface.co/mradermacher/Mistral-Large-218B-Instruct-i1-GGUF)

Compatible `mergekit` config:
```yaml
slices:
- sources:
  - layer_range: [0, 20]
    model: mistralai/Mistral-Large-Instruct-2407
- sources:
  - layer_range: [10, 30]
    model: mistralai/Mistral-Large-Instruct-2407
- sources:
  - layer_range: [20, 40]
    model: mistralai/Mistral-Large-Instruct-2407
- sources:
  - layer_range: [30, 50]
    model: mistralai/Mistral-Large-Instruct-2407
- sources:
  - layer_range: [40, 60]
    model: mistralai/Mistral-Large-Instruct-2407
- sources:
  - layer_range: [50, 70]
    model: mistralai/Mistral-Large-Instruct-2407
- sources:
  - layer_range: [60, 80]
    model: mistralai/Mistral-Large-Instruct-2407
- sources:
  - layer_range: [70, 87]
    model: mistralai/Mistral-Large-Instruct-2407
merge_method: passthrough
dtype: bfloat16
```
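
To see what the config above produces, the following sketch counts the layers in the merged stack and lists which source layers appear more than once. It assumes each `layer_range` is half-open (`[start, end)`), and that a `passthrough` merge simply concatenates the slices in order. The ranges are copied from the config; the helper names are hypothetical.

```python
from collections import Counter

# Layer ranges copied from the mergekit config, treated as half-open [start, end).
LAYER_RANGES = [
    (0, 20), (10, 30), (20, 40), (30, 50),
    (40, 60), (50, 70), (60, 80), (70, 87),
]


def layer_sequence(ranges):
    """Source-layer index for each layer of the merged model, in order."""
    seq = []
    for start, end in ranges:
        seq.extend(range(start, end))
    return seq


def duplicated_layers(ranges):
    """Source layers that appear more than once in the merged stack."""
    counts = Counter(layer_sequence(ranges))
    return sorted(layer for layer, n in counts.items() if n > 1)


merged = layer_sequence(LAYER_RANGES)
dupes = duplicated_layers(LAYER_RANGES)

print(f"merged layers:     {len(merged)}")
print(f"duplicated layers: {dupes[0]}..{dupes[-1]} ({len(dupes)} layers)")
```

Under these assumptions the merged model stacks 157 layers, with source layers 10 through 79 each used twice because consecutive slices overlap by 10 layers.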