LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference (arXiv:2407.14057, published Jul 19, 2024)
Speculative Streaming: Fast LLM Inference without Auxiliary Models (arXiv:2402.11131, published Feb 16, 2024)
FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline (arXiv:2312.11537, published Dec 15, 2023)