Part of the Mamba-In-Zephyr collection: Mamba distilled from Zephyr. Paper: The Mamba in the Llama: Distilling and Accelerating Hybrid Models (https://arxiv.org/abs/2408.15237).
This model is a fine-tuned version of JunxiongWang/mamba_0_875_sft on the HuggingFaceH4/ultrafeedback_binarized dataset. Its evaluation-set metrics at each logged checkpoint are given in the training results table below.
Training hyperparameters: more information needed.
Training results by checkpoint:
Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
---|---|---|---|---|---|---|---|---|---|---|---|
0.1219 | 1.0466 | 2000 | 0.5598 | -1.2751 | -2.5954 | 0.7539 | 1.3204 | -295.7982 | -280.0076 | -2.6264 | -2.6813 |
0.0099 | 2.0931 | 4000 | 0.6922 | -3.9752 | -6.3998 | 0.7852 | 2.4245 | -333.8416 | -307.0094 | -2.4971 | -2.5509 |
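The reward columns in the table above can be cross-checked against each other. The sketch below assumes the usual preference-optimization logging convention, in which Rewards/margins is the mean chosen reward minus the mean rejected reward; the card itself does not state this, and the names `rows`, `margin`, and `check_rows` are hypothetical helpers introduced only for illustration. The values are copied from the two table rows.

```python
# Values copied from the training results table:
# step -> (Rewards/chosen, Rewards/rejected, Rewards/margins)
rows = {
    2000: (-1.2751, -2.5954, 1.3204),
    4000: (-3.9752, -6.3998, 2.4245),
}


def margin(chosen, rejected):
    """Assumed convention: margin = chosen reward - rejected reward."""
    return chosen - rejected


def check_rows(rows, tol=1e-3):
    """Return, per step, whether the reported margin matches chosen - rejected.

    The tolerance absorbs the four-decimal rounding used in the table.
    """
    return {
        step: abs(margin(chosen, rejected) - reported) <= tol
        for step, (chosen, rejected, reported) in rows.items()
    }


if __name__ == "__main__":
    for step, ok in check_rows(rows).items():
        print(step, "consistent" if ok else "inconsistent")
```

Under that assumption both rows agree to within rounding, so the margin roughly doubles between step 2000 and step 4000 while the validation loss rises from 0.5598 to 0.6922.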
@article{junxiongdaniele2024mambainllama,
title = {The Mamba in the Llama: Distilling and Accelerating Hybrid Models},
author = {Junxiong Wang and Daniele Paliotta and Avner May and Alexander M. Rush and Tri Dao},
journal = {arXiv preprint arXiv:2408.15237},
year = {2024}
}
Base model: JunxiongWang/mamba_0_875_sft