- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 605
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 96
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 104
- TransformerFAM: Feedback attention is working memory
  Paper • 2404.09173 • Published • 43
Collections including paper arxiv:2410.16144
- You Only Cache Once: Decoder-Decoder Architectures for Language Models
  Paper • 2405.05254 • Published • 10
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 605
- BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
  Paper • 2406.04333 • Published • 37
- 1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs
  Paper • 2410.16144 • Published • 3
- Can large language models explore in-context?
  Paper • 2403.15371 • Published • 32
- GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling
  Paper • 2403.19655 • Published • 18
- WavLLM: Towards Robust and Adaptive Speech Large Language Model
  Paper • 2404.00656 • Published • 10
- Enabling Memory Safety of C Programs using LLMs
  Paper • 2404.01096 • Published • 1