Mahdi Pourmirzaei's picture

80

Mahdi Pourmirzaei

Mahdip72

·

AI & ML interests

None yet

Recent Activity

upvoted a paper about 20 hours ago

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

upvoted a paper 19 days ago

Byte Latent Transformer: Patches Scale Better Than Tokens

upvoted a paper 24 days ago

Multimodal Latent Language Modeling with Next-Token Diffusion

View all activity

Organizations

None yet

Mahdip72's activity

upvoted a paper about 20 hours ago

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 66

upvoted a paper 19 days ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 24 days ago • 81

upvoted a paper 24 days ago

Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published 25 days ago • 41

upvoted a paper 28 days ago

SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Paper • 2411.10958 • Published Nov 17, 2024 • 52

upvoted 2 papers about 2 months ago

The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

Paper • 2411.07279 • Published Nov 11, 2024 • 3

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 43

upvoted 5 papers 3 months ago

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

Paper • 2410.13848 • Published Oct 17, 2024 • 32

Quantifying Generalization Complexity for Large Language Models

Paper • 2410.01769 • Published Oct 2, 2024 • 13

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published Oct 3, 2024 • 36

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 144

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2, 2024 • 41

updated a model 3 months ago

Mahdip72/prot2token

Updated Oct 5, 2024

upvoted 3 papers 3 months ago

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26, 2024 • 53

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 94

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 58

upvoted 5 papers 4 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 136

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 139

Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models

Paper • 2408.06663 • Published Aug 13, 2024 • 15

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Paper • 2409.02813 • Published Sep 4, 2024 • 28

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 121