pythagoras

dingangui

AI & ML interests

None yet

dingangui's activity

upvoted a paper 23 days ago

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published 26 days ago • 87

upvoted 2 papers about 1 month ago

ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality

Paper • 2412.04062 • Published Dec 5, 2024 • 7

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Paper • 2412.03085 • Published Dec 4, 2024 • 12

upvoted 2 papers 2 months ago

Framer: Interactive Frame Interpolation

Paper • 2410.18978 • Published Oct 24, 2024 • 36

MarDini: Masked Autoregressive Diffusion for Video Generation at Scale

Paper • 2410.20280 • Published Oct 26, 2024 • 23

upvoted 5 papers 3 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18, 2024 • 36

MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Paper • 2410.11779 • Published Oct 15, 2024 • 25

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28, 2024 • 84

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 58

Seeing Faces in Things: A Model and Dataset for Pareidolia

Paper • 2409.16143 • Published Sep 24, 2024 • 17

upvoted 3 papers 4 months ago

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 24

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published Sep 18, 2024 • 76

CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation

Paper • 2409.03643 • Published Sep 5, 2024 • 19

upvoted a paper 5 months ago

Real-Time Video Generation with Pyramid Attention Broadcast

Paper • 2408.12588 • Published Aug 22, 2024 • 16