Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 11 days ago • 83
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published 27 days ago • 39 • 26
Position Information Emerges in Causal Transformers Without Positional Encodings via Similarity of Nearby Embeddings Paper • 2501.00073 • Published 21 days ago • 1
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models Paper • 2412.07171 • Published Dec 10, 2024 • 1 • 1
Lee's RoPE Tricks / Context Extension Reads Collection Set of Long Context (RoPE or otherwise) I'm collecting off of HF • 45 items • Updated 16 days ago • 3
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding Paper • 2501.00712 • Published 19 days ago • 6 • 4
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing Paper • 2501.00658 • Published 19 days ago • 7