MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages Paper • 2410.01036 • Published Oct 1, 2024 • 14
HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors Paper • 2408.06019 • Published Aug 12, 2024 • 13
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction Paper • 2409.18124 • Published Sep 26, 2024 • 32
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 23 items • Updated 12 days ago • 46
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy Sep 18, 2024 • 215
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 453
ReMamba: Equip Mamba with Effective Long-Sequence Modeling Paper • 2408.15496 • Published Aug 28, 2024 • 10
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 38
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design Paper • 2408.12503 • Published Aug 22, 2024 • 23
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22, 2024 • 64
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated Aug 22, 2024 • 83
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19, 2024 • 75
Transformer Language Models without Positional Encodings Still Learn Positional Information Paper • 2203.16634 • Published Mar 30, 2022 • 5
Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated Nov 28, 2024 • 49