LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers Paper • 2310.03294 • Published Oct 5, 2023 • 2
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 32
Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks Paper • 2306.13103 • Published Jun 16, 2023 • 2
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving Paper • 2401.09670 • Published Jan 18, 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding Paper • 2402.02057 • Published Feb 3, 2024
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 25
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Paper • 2403.04132 • Published Mar 7, 2024 • 38
Toward Inference-optimal Mixture-of-Expert Large Language Models Paper • 2404.02852 • Published Apr 3, 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Paper • 2404.08801 • Published Apr 12, 2024 • 64
MPCFormer: fast, performant and private Transformer inference with MPC Paper • 2211.01452 • Published Nov 2, 2022 • 1
Optimizing Speculative Decoding for Serving Large Language Models Using Goodput Paper • 2406.14066 • Published Jun 20, 2024 • 1
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models Paper • 2406.13233 • Published Jun 19, 2024 • 1
Specifications: The missing link to making the development of LLM systems an engineering discipline Paper • 2412.05299 • Published Nov 25, 2024 • 1
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published 7 days ago • 29
Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence Paper • 2305.14334 • Published May 23, 2023 • 1
See, Say, and Segment: Teaching LMMs to Overcome False Premises Paper • 2312.08366 • Published Dec 13, 2023
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Paper • 2410.12851 • Published Oct 10, 2024 • 1