---
base_model:
- meta-llama/Meta-Llama-3-8B
library_name: transformers
license: llama3
---

# Model Card for Llama-3-8B-Instruct-SkillMix

This model was SFT-ed from [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) with data generated by the Seed-Dataset Agnostic (SDA) version of the Instruct-SkillMix pipeline.

## Training Details

We used 4,000 examples from Instruct-SkillMix-SDA(k=2) (data available at [PrincetonPLI/Instruct-SkillMix-SDA](https://huggingface.co/datasets/PrincetonPLI/Instruct-SkillMix-SDA/blob/main/data/ism_sda_k2_4K.json)).

- LR: 2e-5
- Linear Warmup Ratio: 0.03
- Decay: Cosine Decay to 0
- Batch Size: 128
- Epoch: 7 / 15
- Optimizer: AdamW
- Sequence Length: 1024

## Evaluation Details

We provide the generation configurations used for evaluation. A minimal generation sketch using these settings appears at the end of this card.

### AlpacaEval
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 2048
- temperature: 0.9
- top_p: 1.0
- do_sample: True
- stop_token_ids:
  - 128001
  - 128009

### MTBench
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 1024
- temperature: 0.7
- stop_token_ids:
  - 128001
  - 128009

### WildBench
- model_kwargs:
  - torch_dtype: 'bfloat16'
- max_new_tokens: 4096
- temperature: 0.9
- top_p: 1.0
- do_sample: True
- stop_token_ids:
  - 128001
  - 128009

## Citation

Paper: [Instruct-SkillMix](https://www.arxiv.org/abs/2408.14774)

```
@misc{kaur2024instructskillmixpowerfulpipelinellm,
      title={Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning},
      author={Simran Kaur and Simon Park and Anirudh Goyal and Sanjeev Arora},
      year={2024},
      eprint={2408.14774},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2408.14774},
}
```

## Contact

Simran Kaur, Princeton University
Simon Park, Princeton University

{skaur, juhyunp} 'at' princeton 'dot' edu
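
## Example Generation (Unofficial Sketch)

The sketch below shows one way to load the model with `transformers` and sample with the AlpacaEval generation settings listed above. It is not the official evaluation harness: the repository id, the example prompt, and the plain (non-chat-templated) prompt formatting are assumptions for illustration and may need to be adapted to the prompt format used during SFT.

```python
# Unofficial sketch: load the model and generate with the AlpacaEval settings above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrincetonPLI/Llama-3-8B-Instruct-SkillMix"  # assumed repo id; replace if it differs

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the model_kwargs above
    device_map="auto",
)

# Example prompt for illustration only; format it to match the SFT data if needed.
prompt = "Explain the difference between supervised fine-tuning and RLHF."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# AlpacaEval-style sampling configuration from the Evaluation Details section.
outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.9,
    top_p=1.0,
    eos_token_id=[128001, 128009],  # stop_token_ids used during evaluation
)

# Print only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```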