---
base_model:
- meta-llama/Meta-Llama-3-8B
library_name: transformers
license: llama3
---

# Model Card for Llama-3-8B-Instruct-SkillMix

This model was SFT-ed from [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) with data generated by the Seed-Dataset Agnostic (SDA) version of the Instruct-SkillMix pipeline.

## Training Details

We used 4,000 examples from Instruct-SkillMix-SDA(k=2) (data available at [PrincetonPLI/Instruct-SkillMix-SDA](https://huggingface.co/datasets/PrincetonPLI/Instruct-SkillMix-SDA/blob/main/data/ism_sda_k2_4K.json)).

- LR: 2e-5
- Linear Warmup Ratio: 0.03
- Decay: Cosine Decay to 0
- Batch Size: 128
- Epoch: 7 / 15
- Optimizer: AdamW
- Sequence Length: 1024

## Evaluation Details

We provide the generation configurations used for evaluation. A minimal generation sketch using these settings appears at the end of this card.

### AlpacaEval
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 2048
- temperature: 0.9
- top_p: 1.0
- do_sample: True
- stop_token_ids:
  - 128001
  - 128009

### MTBench
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 1024
- temperature: 0.7
- stop_token_ids:
  - 128001
  - 128009

### WildBench
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 4096
- temperature: 0.9
- top_p: 1.0
- do_sample: True
- stop_token_ids:
  - 128001
  - 128009

## Citation

Paper: [Instruct-SkillMix](https://www.arxiv.org/abs/2408.14774)

```
@misc{kaur2024instructskillmixpowerfulpipelinellm,
      title={Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning},
      author={Simran Kaur and Simon Park and Anirudh Goyal and Sanjeev Arora},
      year={2024},
      eprint={2408.14774},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.14774},
}
```

## Contact

Simran Kaur, Princeton University
Simon Park, Princeton University

{skaur, juhyunp} 'at' princeton 'dot' edu
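
## Example Generation (Unofficial Sketch)

The sketch below shows one way to load the model with `transformers` and sample with the AlpacaEval generation settings listed above. It is not the official evaluation harness: the repository id, the example prompt, and the plain (non-chat-templated) prompt formatting are assumptions for illustration and may need to be adapted to the prompt format used during SFT.

```python
# Unofficial sketch: load the model and generate with the AlpacaEval settings above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrincetonPLI/Llama-3-8B-Instruct-SkillMix"  # assumed repo id; replace if it differs

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the model_kwargs above
    device_map="auto",
)

# Example prompt for illustration only; format it to match the SFT data if needed.
prompt = "Explain the difference between supervised fine-tuning and RLHF."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# AlpacaEval-style sampling configuration from the Evaluation Details section.
outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.9,
    top_p=1.0,
    eos_token_id=[128001, 128009],  # stop_token_ids used during evaluation
)

# Print only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```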