On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis • Paper 2501.04377 • Published 24 days ago
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time • Paper 2408.13233 • Published Aug 23, 2024