Dataset
updated
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
Paper • 2504.17565 • Published • 2
PrimeIntellect/synthetic-code-understanding
Viewer • Updated • 60.6k • 47 • 19
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data
Paper • 2507.07095 • Published • 56
VeriGUI: Verifiable Long-Chain GUI Dataset
Paper • 2508.04026 • Published • 162
jupyter-agent/jupyter-agent-dataset
Viewer • Updated • 95.8k • 10.7k • 156
PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits
Paper • 2509.11362 • Published • 5
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
Paper • 2509.15221 • Published • 111
MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook
Paper • 2509.14142 • Published • 10
MMAT-1M: A Large Reasoning Dataset for Multimodal Agent Tuning
Paper • 2507.21924 • Published • 1
Hierarchical Dataset Selection for High-Quality Data Sharing
Paper • 2512.10952 • Published • 2
FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition
Paper • 2512.13884 • Published • 15
RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models
Paper • 2601.03699 • Published • 8
Extreme Multi-Label Skill Extraction Training using Large Language Models
Paper • 2307.10778 • Published
DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
Paper • 2602.16742 • Published • 11
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale
Paper • 2602.23866 • Published • 57