B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 14 days ago • 44
M-STAR Collection Resources of M-STAR (Multimodal Self-Evolving Training for Reasoning) https://mstar-lmm.github.io/ • 2 items • Updated 12 days ago • 2
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation Paper • 2304.05977 • Published Apr 12, 2023 • 1
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving Paper • 2407.13690 • Published Jun 18, 2024 • 2