Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 5 days ago • 41
MiniCPM Collection The MiniCPM family of LLMs and VLLMs. • 32 items • Updated about 10 hours ago • 59
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 10 days ago • 77
Phi-4 (All Versions) Collection Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated 7 days ago • 29
2024 Interconnects Artifacts Collection Models & datasets mentioned in the bottom section of posts! • 280 items • Updated 17 days ago • 6
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published Dec 4, 2024 • 46
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 129
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 20 items • Updated 4 days ago • 101
Llama 3.3 Collection This collection hosts the transformers and original repos of Llama 3.3 • 1 item • Updated Dec 6, 2024 • 113
SmolVLM Collection State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated 28 days ago • 31
Article Use Models from the Hugging Face Hub in LM Studio By yagilb • Nov 28, 2024 • 132
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 261
AMD-OLMo Collection AMD-OLMo are a series of 1 billion parameter language models trained by AMD on AMD Instinct™ MI250 GPUs based on OLMo. • 4 items • Updated Oct 31, 2024 • 18
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 28 days ago • 204
Article Releasing Outlines-core 0.1.0: structured generation in Rust and Python Oct 22, 2024 • 44