Article: Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques • jmamou • Mar 24, 2025
Article: Faster Assisted Generation with Dynamic Speculation • jmamou, orenpereg, joaogante, lewtun, danielkorat, Nadav-Timor, moshew • Oct 8, 2024
Paper: Distributed Speculative Inference of Large Language Models • 2405.14105 • Published May 23, 2024