Mantis Collection Mantis model family optimized for multi-image reasoning with interleaved text/image format • 11 items • Updated Jul 2, 2024 • 9
PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models Paper • 2403.02246 • Published Mar 4, 2024 • 1
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation Paper • 2403.16422 • Published Mar 25, 2024 • 1
Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians Paper • 2403.17898 • Published Mar 26, 2024 • 15
GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Paper • 2409.04196 • Published Sep 6, 2024 • 14
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4, 2024 • 92
Sapiens Collection Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 72 items • Updated Sep 18, 2024 • 51
Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Article • Jun 24, 2024 • 184