PlayAI's Collections
Interesting Papers
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
Paper • 2411.02959 • Published • 64
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details
Paper • 2411.03047 • Published • 8
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
Paper • 2411.02336 • Published • 23
GenXD: Generating Any 3D and 4D Scenes
Paper • 2411.02319 • Published • 20
Fashion-VDM: Video Diffusion Model for Virtual Try-On
Paper • 2411.00225 • Published • 9
Face Anonymization Made Simple
Paper • 2411.00762 • Published • 7
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models
Paper • 2410.22901 • Published • 8
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders
Paper • 2410.22366 • Published • 77
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
Paper • 2410.18666 • Published • 19
Emu3: Next-Token Prediction is All You Need
Paper • 2409.18869 • Published • 94
Hymba: A Hybrid-head Architecture for Small Language Models
Paper • 2411.13676 • Published • 39
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations
Paper • 2411.10818 • Published • 24
RedPajama: an Open Dataset for Training Large Language Models
Paper • 2411.12372 • Published • 47
Generative World Explorer
Paper • 2411.11844 • Published • 75
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices
Paper • 2411.10640 • Published • 44
AnimateAnything: Consistent and Controllable Animation for Video Generation
Paper • 2411.10836 • Published • 23
SlimLM: An Efficient Small Language Model for On-Device Document Assistance
Paper • 2411.09944 • Published • 12
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
Paper • 2411.10499 • Published • 13
StableV2V: Stablizing Shape Consistency in Video-to-Video Editing
Paper • 2411.11045 • Published • 11
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
Paper • 2411.06558 • Published • 34
LLaVA-o1: Let Vision Language Models Reason Step-by-Step
Paper • 2411.10440 • Published • 111
Cut Your Losses in Large-Vocabulary Language Models
Paper • 2411.09009 • Published • 43
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Paper • 2411.10442 • Published • 68
From CISC to RISC: language-model guided assembly transpilation
Paper • 2411.16341 • Published • 11
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training
Paper • 2411.15124 • Published • 57
MoViE: Mobile Diffusion for Video Editing
Paper • 2412.06578 • Published • 18
GraPE: A Generate-Plan-Edit Framework for Compositional T2I Synthesis
Paper • 2412.06089 • Published • 4
Mogo: RQ Hierarchical Causal Transformer for High-Quality 3D Human Motion Generation
Paper • 2412.07797 • Published • 11
No More Adam: Learning Rate Scaling at Initialization is All You Need
Paper • 2412.11768 • Published • 41
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
Paper • 2412.11919 • Published • 33