Update README.md
README.md CHANGED
@@ -127,8 +127,9 @@ In this process, the CNT solar cells generate a tiny amount of power, but when t
 ```
 
 ## Usage with HuggingFace transformers
-Model weights were converted
-
+Model weights were converted from the original Mamba2 implementation to be Hugging Face compatible. <br>
+Due to the lack of official support for Mamba2 attention layers in Hugging Face Transformers, custom modeling files are included. <br>
+The attention layer implementation is based on the work from Pull Request #32027 in the Hugging Face Transformers repository: [https://github.com/huggingface/transformers/pull/32027](https://github.com/huggingface/transformers/pull/32027)
 
 To speed up inference, we recommend installing mamba-ssm and flash attention 2.
 
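The added lines describe the usual remote-code loading path in transformers: because custom modeling files ship with the checkpoint, loading goes through `trust_remote_code=True`. The sketch below is a minimal, hypothetical example of that path, not the repository's documented snippet; the repository id `org/mamba2-hybrid-hf` is a placeholder, and passing `attn_implementation="flash_attention_2"` assumes the bundled custom modeling code honors that flag.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id -- substitute the actual converted checkpoint.
repo_id = "org/mamba2-hybrid-hf"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,                   # pick up the bundled custom Mamba2/attention modeling files
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # assumed to apply; only if flash-attn 2 is installed
).to("cuda")

# Optional speedups per the README: `pip install mamba-ssm flash-attn`
# (both require a CUDA toolchain; the model falls back to slower kernels without them).

prompt = "Mamba2 hybrid models combine"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With mamba-ssm and flash attention 2 installed, the same call sequence runs unchanged; the speedup comes entirely from the faster kernels being picked up at load time.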