AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 70
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 15 days ago • 293
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 225
Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated Nov 28, 2024 • 49
Parler-TTS: fully open-source high-quality TTS Collection If you want to find out more about how these models were trained and even fine-tune them yourself, check-out the Parler-TTS repository on GitHub. • 8 items • Updated Dec 2, 2024 • 49
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 4 days ago • 60
Chameleon Collection Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. • 2 items • Updated Jul 9, 2024 • 28
xLAM models Collection xLAM: A Family of Large Action Models to Empower AI Agent Systems: https://github.com/SalesforceAIResearch/xLAM • 11 items • Updated Dec 19, 2024 • 45
SEED-Story: Multimodal Long Story Generation with Large Language Model Paper • 2407.08683 • Published Jul 11, 2024 • 22
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 14 items • Updated Jun 14, 2024 • 31
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 91
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11, 2024 • 44
CameraCtrl: Enabling Camera Control for Text-to-Video Generation Paper • 2404.02101 • Published Apr 2, 2024 • 22
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11, 2024 • 76