Collections

Collections including paper arxiv:2402.17177

Daily paper that worth reading in details later

Neural Network Diffusion

Paper • 2402.13144 • Published Feb 20, 2024 • 95
Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23, 2024 • 70
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

Paper • 2402.17177 • Published Feb 27, 2024 • 88
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

Paper • 2403.00522 • Published Mar 1, 2024 • 44

Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling

Paper • 2401.15977 • Published Jan 29, 2024 • 37
Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23, 2024 • 86
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

Paper • 2307.04725 • Published Jul 10, 2023 • 64
Boximator: Generating Rich and Controllable Motions for Video Synthesis

Paper • 2402.01566 • Published Feb 2, 2024 • 26

3D Avatar Utils

Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance

Paper • 2401.15687 • Published Jan 28, 2024 • 23
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians

Paper • 2312.03029 • Published Dec 5, 2023 • 23
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation

Paper • 2312.13578 • Published Dec 21, 2023 • 27
Splatter Image: Ultra-Fast Single-View 3D Reconstruction

Paper • 2312.13150 • Published Dec 20, 2023 • 14

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18, 2024 • 15
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18, 2024 • 8
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18, 2024 • 10
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19, 2024 • 13

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 181
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Paper • 2401.04658 • Published Jan 9, 2024 • 25
Weaver: Foundation Models for Creative Writing

Paper • 2401.17268 • Published Jan 30, 2024 • 43
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 17

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Paper • 2312.13964 • Published Dec 21, 2023 • 18
LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 257
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Paper • 2312.12491 • Published Dec 19, 2023 • 69
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4, 2024 • 14

paper to review

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

Paper • 2312.02087 • Published Dec 4, 2023 • 20
FaceStudio: Put Your Face Everywhere in Seconds

Paper • 2312.02663 • Published Dec 5, 2023 • 30
Orthogonal Adaptation for Modular Customization of Diffusion Models

Paper • 2312.02432 • Published Dec 5, 2023 • 12
ReconFusion: 3D Reconstruction with Diffusion Priors

Paper • 2312.02981 • Published Dec 5, 2023 • 8

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

Paper • 2306.07967 • Published Jun 13, 2023 • 24
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Paper • 2306.07954 • Published Jun 13, 2023 • 112
TryOnDiffusion: A Tale of Two UNets

Paper • 2306.08276 • Published Jun 14, 2023 • 72
Seeing the World through Your Eyes

Paper • 2306.09348 • Published Jun 15, 2023 • 33

FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline

Paper • 2311.13073 • Published Nov 22, 2023 • 56
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

Paper • 2402.17177 • Published Feb 27, 2024 • 88

Can LLMs Follow Simple Rules?

Paper • 2311.04235 • Published Nov 6, 2023 • 10
The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26, 2024 • 78
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 183
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

Paper • 2402.17177 • Published Feb 27, 2024 • 88
