- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 45
- VCoder: Versatile Vision Encoders for Multimodal Large Language Models
  Paper • 2312.14233 • Published • 16
- Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities
  Paper • 2405.18669 • Published • 11
Collections
Collections including paper arxiv:2312.11805
- VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
  Paper • 2312.02087 • Published • 21
- FaceStudio: Put Your Face Everywhere in Seconds
  Paper • 2312.02663 • Published • 31
- Orthogonal Adaptation for Modular Customization of Diffusion Models
  Paper • 2312.02432 • Published • 13
- ReconFusion: 3D Reconstruction with Diffusion Priors
  Paper • 2312.02981 • Published • 9
- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 158
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 607
- Mixtral of Experts
  Paper • 2401.04088 • Published • 158
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- Don't Make Your LLM an Evaluation Benchmark Cheater
  Paper • 2311.01964 • Published • 1