Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning Paper • 2402.15017 • Published Feb 22, 2024
Out-of-distribution generalization via composition: a lens through induction heads in Transformers Paper • 2408.09503 • Published Aug 18, 2024