view article Article Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK By davidberenstein1957 • Nov 21, 2024 • 35
view article Article Releasing the largest multilingual open pretraining dataset By Pclanglais • Nov 13, 2024 • 98
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 101
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 353
📀 Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 35
Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck Paper • 2404.07647 • Published Apr 11, 2024 • 4
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 121
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese Paper • 2401.16640 • Published Jan 30, 2024 • 8
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 221
Apple MLX-compatible 7B LLMs on the 🤗 Hub Collection This collection contains the model weights for 7B LLMs for Apple's MLX framework. Find more information at https://github.com/ml-explore/mlx • 8 items • Updated Sep 2, 2024 • 9