- MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention (arXiv:2407.02490, published Jul 2, 2024)
- Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations (arXiv:2406.13632, published Jun 19, 2024)
- LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs (arXiv:2406.15319, published Jun 21, 2024)
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length (arXiv:2404.08801, published Apr 12, 2024)
- RULER: What's the Real Context Size of Your Long-Context Language Models? (arXiv:2404.06654, published Apr 9, 2024)
- Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction (arXiv:2407.03651, published Jul 4, 2024)
- Long Code Arena: a Set of Benchmarks for Long-Context Code Models (arXiv:2406.11612, published Jun 17, 2024)
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs (arXiv:2408.07055, published Aug 13, 2024)
- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval (arXiv:2409.10516, published Sep 16, 2024)
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models (arXiv:2409.16191, published Sep 24, 2024)
- Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction (arXiv:2409.17422, published Sep 25, 2024)
- L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding? (arXiv:2410.02115, published Oct 3, 2024)