- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
  Paper • 2403.01807 • Published • 8
- TripoSR: Fast 3D Object Reconstruction from a Single Image
  Paper • 2403.02151 • Published • 13
- OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
  Paper • 2403.01779 • Published • 29
- MagicClay: Sculpting Meshes With Generative Neural Fields
  Paper • 2403.02460 • Published • 8
Collections including paper arxiv:2404.13026
- Seamless Human Motion Composition with Blended Positional Encodings
  Paper • 2402.15509 • Published • 14
- TripoSR: Fast 3D Object Reconstruction from a Single Image
  Paper • 2403.02151 • Published • 13
- 3D-VLA: A 3D Vision-Language-Action Generative World Model
  Paper • 2403.09631 • Published • 8
- Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting
  Paper • 2403.09981 • Published • 7
- Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
  Paper • 2401.15977 • Published • 38
- Lumiere: A Space-Time Diffusion Model for Video Generation
  Paper • 2401.12945 • Published • 86
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
  Paper • 2307.04725 • Published • 64
- Boximator: Generating Rich and Controllable Motions for Video Synthesis
  Paper • 2402.01566 • Published • 27
- GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
  Paper • 2401.04092 • Published • 21
- AToM: Amortized Text-to-Mesh using 2D Diffusion
  Paper • 2402.00867 • Published • 11
- Advances in 3D Generation: A Survey
  Paper • 2401.17807 • Published • 19
- SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding
  Paper • 2401.09340 • Published • 20
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 31
- Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
  Paper • 2312.17172 • Published • 28
- Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
  Paper • 2401.01974 • Published • 7
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 28
- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 112
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 72
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33