LLM Compression - a TonyMou Collection

updated Nov 14, 2024

Quantization, Pruning, Distillation

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Paper • 2407.11062 • Published Jul 10, 2024 • 8
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs

Paper • 2410.05265 • Published Oct 7, 2024 • 30
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models

Paper • 2308.13137 • Published Aug 25, 2023 • 17