- Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability (arXiv:2411.19943, published Nov 29, 2024, 56 upvotes)
- Cautious Optimizers: Improving Training with One Line of Code (arXiv:2411.16085, published Nov 25, 2024, 15 upvotes)
- TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (arXiv:2410.23168, published Oct 30, 2024, 24 upvotes)
- Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA (arXiv:2410.20672, published Oct 28, 2024, 6 upvotes)
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss (arXiv:2410.17243, published Oct 22, 2024, 89 upvotes)
- MiniPLM: Knowledge Distillation for Pre-Training Language Models (arXiv:2410.17215, published Oct 22, 2024, 14 upvotes)
- GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models (arXiv:2410.05229, published Oct 7, 2024, 22 upvotes)
- Eurus (collection): Advancing LLM Reasoning Generalists with Preference Trees (11 items, updated Oct 22, 2024, 24 upvotes)
- Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale (arXiv:2409.17115, published Sep 25, 2024, 61 upvotes)
- InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning (arXiv:2409.12568, published Sep 19, 2024, 48 upvotes)
- EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer (arXiv:2409.10819, published Sep 17, 2024, 18 upvotes)