Generation
Fast Best-of-N Decoding via Speculative Rejection (Paper • 2410.20290)
Long Context
Why Does the Effective Context Length of LLMs Fall Short? (Paper • 2410.18745)
Language Models can Self-Lengthen to Generate Long Texts (Paper • 2410.23933)
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference (Paper • 2410.21465)