Exposing Attention Glitches with Flip-Flop Language Modeling Paper • 2306.00946 • Published Jun 1, 2023
TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression Paper • 2306.00788 • Published Jun 1, 2023
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024