Sugato Ray

sugatoray

https://linkedin.com/in/sugatoray

AI & ML interests

None yet

Recent Activity

updated a collection 1 day ago

LLM Training Datasets

updated a collection 1 day ago

liked a Space 1 day ago

davidberenstein1957/transformers-pipeline-playground

sugatoray's activity

upvoted a collection 2 days ago

SwiftKV Models

SwiftKV reduces prefill compute by up to 50% by combining model rewiring and knowledge-preserving self-distillation. • 3 items • Updated 30 days ago • 3

upvoted a paper 2 days ago

Xmodel-2 Technical Report

Paper • 2412.19638 • Published 8 days ago • 18

upvoted an article 3 days ago

Fine-tune ModernBERT for text classification using synthetic data

Article • 5 days ago • 17

upvoted a paper 4 days ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published 19 days ago • 44

upvoted a paper 7 days ago

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published 11 days ago • 33

upvoted a paper 9 days ago

GUI Agents: A Survey

Paper • 2412.13501 • Published 17 days ago • 23

upvoted a collection 9 days ago

DeepSeek-V3

2 items • Updated 9 days ago • 91

upvoted a paper 9 days ago

Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published 11 days ago • 41

upvoted a collection 10 days ago

QVQ-72B-Preview

5 items • Updated 10 days ago • 5

upvoted an article 11 days ago

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Article • Jul 29, 2024 • 260

upvoted 2 papers 11 days ago

MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

Paper • 2412.14590 • Published 16 days ago • 13

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Paper • 2412.12094 • Published 18 days ago • 10

upvoted a collection 11 days ago

QwQ

Qwen with Questions • 2 items • Updated Nov 28, 2024 • 53

upvoted a paper 11 days ago

Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

Paper • 2412.13194 • Published 17 days ago • 12

upvoted 3 papers 14 days ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published 17 days ago • 91

Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture

Paper • 2412.11834 • Published 19 days ago • 6

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 16 days ago • 48

upvoted a paper 15 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 17 days ago • 116

upvoted a collection 15 days ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 16 days ago • 112

upvoted a collection 16 days ago

OmniEval

An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain • 7 items • Updated 2 days ago • 2