Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 17 days ago • 116
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 47
FluidML: Fast and Memory Efficient Inference Optimization Paper • 2411.09242 • Published Nov 14, 2024 • 1
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 57
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 12 days ago • 197
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy Sep 18, 2024 • 215
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 29 days ago • 551
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 22 days ago • 82
Trained Models Collection They may be small, but they're training like giants! • 8 items • Updated Dec 3, 2024 • 17
Minerva LLMs Collection The first family of LLMs pretrained from scratch on Italian. • 6 items • Updated 28 days ago • 32