---
base_model:
- meta-llama/Meta-Llama-3-8B
library_name: transformers
license: llama3
---
# Model Card for Llama-3-8B-Instruct-SkillMix
This model was fine-tuned (SFT) from [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on data generated by the Seed-Dataset Agnostic (SDA) version of the Instruct-SkillMix pipeline.
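A minimal loading sketch with `transformers` is shown below; the repository id is inferred from this card's title and is an assumption, so adjust it to the actual repository.

```python
# Hedged sketch: load the model and tokenizer with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, inferred from the card title; adjust if needed.
repo_id = "PrincetonPLI/Llama-3-8B-Instruct-SkillMix"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the evaluation configs below
    device_map="auto",
)
```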
## Training Details
We used 4,000 examples from Instruct-SkillMix-SDA (k=2) (data available at [PrincetonPLI/Instruct-SkillMix-SDA](https://huggingface.co/datasets/PrincetonPLI/Instruct-SkillMix-SDA/blob/main/data/ism_sda_k2_4K.json)). Hyperparameters were as follows (a training sketch follows the list):
- Learning Rate: 2e-5
- Linear Warmup Ratio: 0.03
- LR Decay: cosine decay to 0
- Batch Size: 128
- Epochs: 7 (of 15)
- Optimizer: AdamW
- Sequence Length: 1024
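For illustration only, the hyperparameters above could be expressed with TRL's `SFTTrainer` roughly as below. This is not the authors' training script; the dataset preprocessing, per-device batch size, and gradient-accumulation split are assumptions.

```python
# Hedged sketch, not the authors' training code.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# The 4K-example file linked above; its exact schema may require
# preprocessing into a single text column before SFT.
dataset = load_dataset("json", data_files="ism_sda_k2_4K.json")["train"]

config = SFTConfig(
    output_dir="llama3-8b-instruct-skillmix",
    learning_rate=2e-5,              # LR: 2e-5
    warmup_ratio=0.03,               # linear warmup ratio 0.03
    lr_scheduler_type="cosine",      # cosine decay to 0
    per_device_train_batch_size=8,   # assumed split; 8 * 16 = 128 total
    gradient_accumulation_steps=16,
    num_train_epochs=15,             # the epoch-7 checkpoint was selected
    max_seq_length=1024,             # sequence length 1024
    optim="adamw_torch",             # AdamW
    bf16=True,
)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B",
    args=config,
    train_dataset=dataset,
)
trainer.train()
```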
## Evaluation Details
We provide the generation configuration used for each evaluation benchmark; a usage sketch follows the last configuration.
### AlpacaEval
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 2048
- temperature: 0.9
- top_p: 1.0
- do_sample: True
- stop_token_ids:
  - 128001
  - 128009
### MTBench
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 1024
- temperature: 0.7
- stop_token_ids:
  - 128001
  - 128009
### WildBench
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 4096
- temperature: 0.9
- top_p: 1.0
- do_sample: True
- stop_token_ids:
  - 128001
  - 128009
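As a usage sketch, the AlpacaEval settings above map onto `transformers`' `generate()` as follows, reusing `model` and `tokenizer` from the loading snippet above (the prompt is a placeholder). MTBench and WildBench follow the same pattern with their respective values.

```python
# Hedged sketch: sampling with the AlpacaEval generation settings.
prompt = "How do I bake sourdough bread?"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.9,
    top_p=1.0,
    do_sample=True,
    eos_token_id=[128001, 128009],  # the stop_token_ids listed above
)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```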
## Citation
Paper: [Instruct-SkillMix](https://arxiv.org/abs/2408.14774)
```bibtex
@misc{kaur2024instructskillmixpowerfulpipelinellm,
      title={Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning},
      author={Simran Kaur and Simon Park and Anirudh Goyal and Sanjeev Arora},
      year={2024},
      eprint={2408.14774},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.14774},
}
```
## Contact
Simran Kaur, Princeton University
Simon Park, Princeton University
{skaur, juhyunp} 'at' princeton 'dot' edu