Raja Biswas

rbiswasfc

AI & ML interests

NLP, Generative AI

Recent Activity

upvoted a paper 9 days ago

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

liked a model 9 days ago

Qwen/QVQ-72B-Preview

liked a model 9 days ago

deepseek-ai/DeepSeek-V3-Base

Articles

Finally, a Replacement for BERT: Introducing ModernBERT

16 days ago • 426

rbiswasfc's activity

upvoted a paper 9 days ago

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Paper • 2412.14922 • Published 16 days ago • 82

liked 2 models 9 days ago

Qwen/QVQ-72B-Preview

Image-Text-to-Text • Updated 10 days ago • 51.1k • 438

deepseek-ai/DeepSeek-V3-Base

Updated 5 days ago • 7.53k • 1.12k

updated a dataset 10 days ago

rbiswasfc/eedi-awq-calibration-tutor

Viewer • Updated 10 days ago • 128 • 41

authored a paper 15 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 17 days ago • 116

upvoted 2 papers 15 days ago

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published 19 days ago • 41

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 15 days ago • 334

upvoted a collection 15 days ago

ModernBERT

Collection

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 16 days ago • 112

upvoted a paper 15 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 17 days ago • 116

liked 2 models 15 days ago

answerdotai/ModernBERT-large

Fill-Mask • Updated 9 days ago • 23.8k • 284

answerdotai/ModernBERT-base

Fill-Mask • Updated 9 days ago • 77.7k • 592

liked a Space 16 days ago

📈 Scaling test-time compute

Space • Running • 423

upvoted 3 papers 20 days ago

updated a dataset 26 days ago

rbiswasfc/eedi-awq-calibration-cot

Viewer • Updated 26 days ago • 1.02k • 16

updated a dataset about 1 month ago

rbiswasfc/eedi-awq-calibration

Viewer • Updated Nov 27, 2024 • 1.02k • 12

upvoted 3 papers about 2 months ago

Can Knowledge Editing Really Correct Hallucinations?

Paper • 2410.16251 • Published Oct 21, 2024 • 54

CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Paper • 2410.23090 • Published Oct 30, 2024 • 54

What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Paper • 2410.23743 • Published Oct 31, 2024 • 59