LLM Reasoning Papers Collection Papers on improving the reasoning capabilities of LLMs • 17 items • Updated 9 days ago • 91
⛈️ Llama-3.1 Storm Models Collection Fine-tuned Llama 3.1 8B model with superior reasoning, conversation abilities, and function calling! • 3 items • Updated Aug 25, 2024 • 15
Tulu V2.5 Suite Collection A suite of models trained using DPO and PPO across a wide variety (up to 14) of preference datasets. See https://arxiv.org/abs/2406.09279 for more! • 44 items • Updated Nov 27, 2024 • 14
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 14 items • Updated Jun 14, 2024 • 30
Magpie-Pro Datasets (Llama-3) Collection Dataset built with Meta Llama 3 70B. Models are fine-tuned from Llama 3 8B. • 6 items • Updated Sep 20, 2024 • 16
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 19 days ago • 142
Comparing DPO with IPO and KTO Collection A collection of chat models to explore the differences between three alignment techniques: DPO, IPO, and KTO. • 56 items • Updated Jan 9, 2024 • 31
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 257
Awesome feedback datasets Collection A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12, 2024 • 66
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 124