-
RoFormer: Enhanced Transformer with Rotary Position Embedding
Paper • 2104.09864 • Published • 11 -
Attention Is All You Need
Paper • 1706.03762 • Published • 50 -
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Paper • 2404.03715 • Published • 61 -
Zero-Shot Tokenizer Transfer
Paper • 2405.07883 • Published • 5
Collections
Discover the best community collections!
Collections including paper arxiv:2401.06118
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6 -
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 21 -
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 12 -
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 66
-
SqueezeLLM: Dense-and-Sparse Quantization
Paper • 2306.07629 • Published • 4 -
Norm Tweaking: High-performance Low-bit Quantization of Large Language Models
Paper • 2309.02784 • Published • 1 -
Extreme Compression of Large Language Models via Additive Quantization
Paper • 2401.06118 • Published • 12 -
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Paper • 2402.04291 • Published • 49
-
QuIP: 2-Bit Quantization of Large Language Models With Guarantees
Paper • 2307.13304 • Published • 2 -
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
Paper • 2306.03078 • Published • 3 -
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
Paper • 2308.13137 • Published • 17 -
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Paper • 2306.00978 • Published • 9