- RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
  Paper • 2412.14922 • Published • 82
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 42
- Deliberation in Latent Space via Differentiable Cache Augmentation
  Paper • 2412.17747 • Published • 28
- Outcome-Refining Process Supervision for Code Generation
  Paper • 2412.15118 • Published • 19
Robin Williams PRO
bfuzzy1
AI & ML interests
None yet
Recent Activity
updated a collection about 5 hours ago: acheron
updated a model about 6 hours ago: bfuzzy1/acheron-d
updated a model about 7 hours ago: bfuzzy1/llambses-1
Organizations
None yet
Collections
11
llambses-1 models