Collections
Discover the best community collections!
Collections including paper arxiv:2405.18386
-
AtP*: An efficient and scalable method for localizing LLM behaviour to components
Paper • 2403.00745 • Published • 12 -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 605 -
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Paper • 2402.16840 • Published • 23 -
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper • 2402.13753 • Published • 114
-
Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion
Paper • 2402.03162 • Published • 17 -
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
Paper • 2402.03040 • Published • 17 -
Magic-Me: Identity-Specific Video Customized Diffusion
Paper • 2402.09368 • Published • 27 -
LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing
Paper • 2402.10294 • Published • 24
-
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Paper • 2401.09985 • Published • 15 -
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Paper • 2401.09962 • Published • 8 -
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
Paper • 2401.10404 • Published • 10 -
ActAnywhere: Subject-Aware Video Background Generation
Paper • 2401.10822 • Published • 13
-
Retrieval-Augmented Text-to-Audio Generation
Paper • 2309.08051 • Published • 6 -
A Large-scale Dataset for Audio-Language Representation Learning
Paper • 2309.11500 • Published • 9 -
End-to-End Speech Recognition Contextualization with Large Language Models
Paper • 2309.10917 • Published • 9 -
FoleyGen: Visually-Guided Audio Generation
Paper • 2309.10537 • Published • 8