Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models • Paper • 2601.07372 • Published 19 days ago • 40
UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation • Paper • 2504.08761 • Published Mar 31, 2025 • 7
What Matters in Data Curation for Multimodal Reasoning? Insights from the DCVLR Challenge • Paper • 2601.10922 • Published 15 days ago • 3
Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning • Paper • 2601.13697 • Published 11 days ago • 3
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-Language Navigation • Paper • 2601.13976 • Published 10 days ago • 21
Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization • Paper • 2601.12993 • Published 11 days ago • 75
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking • Paper • 2601.04720 • Published 23 days ago • 51
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection • Paper • 2512.23273 • Published Dec 29, 2025 • 14
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times • Paper • 2512.16093 • Published Dec 18, 2025 • 95
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI • Paper • 2512.16676 • Published Dec 18, 2025 • 214
Few-Step Distillation for Text-to-Image Generation: A Practical Guide • Paper • 2512.13006 • Published Dec 15, 2025 • 8
RF-DETR: Neural Architecture Search for Real-Time Detection Transformers • Paper • 2511.09554 • Published Nov 12, 2025 • 8
Next-Embedding Prediction Makes Strong Vision Learners • Paper • 2512.16922 • Published Dec 18, 2025 • 85
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling • Paper • 2512.14614 • Published Dec 16, 2025 • 71
In Pursuit of Pixel Supervision for Visual Pre-training • Paper • 2512.15715 • Published Dec 17, 2025 • 11