---
license: apache-2.0
---

# BlackMamba
<img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/JdxNtwFrmEAnjJ0_MP5A3.jpeg" width="900" height="900" />


> **BlackMamba: Mixture of Experts for State-space models**\
> Quentin Anthony*, Yury Tokpanov*, Paolo Glorioso*, Beren Millidge*\
> Paper: https://arxiv.org/abs/2402.01771

<img src="https://cdn-uploads.huggingface.co/production/uploads/65bc13717c6ad1994b6619e9/aHpEc5tnCJShO2Kn0f637.png" width="900" height="900" />

## About
We provide inference code for our BlackMamba model in our GitHub repository: https://github.com/Zyphra/BlackMamba

BlackMamba is a novel architecture that combines state-space models (SSMs) with mixture of experts (MoE). It uses [Mamba](https://arxiv.org/abs/2312.00752) as its SSM block and the [Switch Transformer](https://arxiv.org/abs/2101.03961) as the basis for its MoE block. BlackMamba has very low latency for generation and inference, providing significant speedups over classical transformers, MoEs, and Mamba SSM models. Additionally, because its sequence mixer is an SSM rather than attention, BlackMamba retains linear computational complexity in the sequence length.
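
To make the two ingredients concrete, here is a minimal pure-Python sketch. It is not the BlackMamba implementation: `ssm_scan`, `switch_route`, `moe_layer`, and `toy_block` are hypothetical names, the recurrence is a fixed scalar one (Mamba's is input-dependent and far richer), and the way `toy_block` chains the two pieces is an assumption made only for illustration. It shows why a recurrent sequence mixer costs one step per token, and how Switch-style routing sends each token to a single expert.

```python
import math


def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Toy scalar recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    A single pass over the sequence, so the work grows linearly
    with the sequence length.
    """
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def switch_route(token, router_weights):
    """Top-1 routing: return (expert index, gate probability)."""
    logits = [sum(w * t for w, t in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    k = max(range(len(probs)), key=probs.__getitem__)
    return k, probs[k]


def moe_layer(tokens, router_weights, experts):
    """Run each token through its single chosen expert, scaled by the gate."""
    out = []
    for tok in tokens:
        k, gate = switch_route(tok, router_weights)
        out.append([gate * v for v in experts[k](tok)])
    return out


def toy_block(tokens, router_weights, experts):
    """Illustrative composition (an assumption): mix along the sequence
    per channel, then apply the routed experts per token."""
    channels = list(zip(*tokens))              # one sequence per channel
    mixed = [ssm_scan(ch) for ch in channels]  # linear-time sequence mixing
    mixed_tokens = [list(t) for t in zip(*mixed)]
    return moe_layer(mixed_tokens, router_weights, experts)


if __name__ == "__main__":
    # Hypothetical toy setup: 2-dimensional tokens and 2 experts.
    router = [[2.0, 0.0], [0.0, 2.0]]
    experts = [lambda t: [2 * v for v in t], lambda t: [-v for v in t]]
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5))  # [1.0, 0.5, 0.25]
    print(toy_block([[1.0, 0.0], [0.0, 1.0]], router, experts))
```

Because only one expert runs per token, the compute per token stays roughly constant as more experts are added, while the total parameter count grows.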