MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions • Paper • 2410.02743 • Published Oct 3, 2024
Tokenization Falling Short: The Curse of Tokenization • Paper • 2406.11687 • Published Jun 17, 2024