Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2509.19894

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 185
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12, 2024 • 73
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19 • 118
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Paper • 2509.25541 • Published 14 days ago • 131

PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning

Paper • 2509.19894 • Published 20 days ago • 31
xl-zhao/PromptCoT-2.0-Prompt-Generation-Model

33B • Updated 18 days ago • 10 • 1
xl-zhao/PromptCoT-2.0-SFT-7B

8B • Updated 18 days ago • 11
xl-zhao/PromptCoT-2.0-SelfPlay-4B

4B • Updated 18 days ago • 17

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 86
BM25S: Orders of magnitude faster lexical search via eager sparse scoring

Paper • 2407.03618 • Published Jul 4, 2024 • 13
Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21 • 87
R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7 • 126

about 16 hours ago

Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26 • 15
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

Paper • 2506.19290 • Published Jun 24 • 52
CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks

Paper • 2105.12655 • Published May 25, 2021
StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 148

about 21 hours ago

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30 • 68
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3 • 21
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

Paper • 2509.03405 • Published Sep 3 • 23
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs

Paper • 2509.00930 • Published Aug 31 • 4

Evaluating Very Long-Term Conversational Memory of LLM Agents

Paper • 2402.17753 • Published Feb 27, 2024 • 20
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Paper • 2402.16671 • Published Feb 26, 2024 • 29
Do Large Language Models Latently Perform Multi-Hop Reasoning?

Paper • 2402.16837 • Published Feb 26, 2024 • 29
Divide-or-Conquer? Which Part Should You Distill Your LLM?

Paper • 2402.15000 • Published Feb 22, 2024 • 24

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 185
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12, 2024 • 73
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19 • 118
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Paper • 2509.25541 • Published 14 days ago • 131

about 16 hours ago

Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26 • 15
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

Paper • 2506.19290 • Published Jun 24 • 52
CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks

Paper • 2105.12655 • Published May 25, 2021
StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 148

PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning

Paper • 2509.19894 • Published 20 days ago • 31
xl-zhao/PromptCoT-2.0-Prompt-Generation-Model

33B • Updated 18 days ago • 10 • 1
xl-zhao/PromptCoT-2.0-SFT-7B

8B • Updated 18 days ago • 11
xl-zhao/PromptCoT-2.0-SelfPlay-4B

4B • Updated 18 days ago • 17

about 21 hours ago

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30 • 68
Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3 • 21
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

Paper • 2509.03405 • Published Sep 3 • 23
SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs

Paper • 2509.00930 • Published Aug 31 • 4

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 86
BM25S: Orders of magnitude faster lexical search via eager sparse scoring

Paper • 2407.03618 • Published Jul 4, 2024 • 13
Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21 • 87
R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7 • 126

Evaluating Very Long-Term Conversational Memory of LLM Agents

Paper • 2402.17753 • Published Feb 27, 2024 • 20
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Paper • 2402.16671 • Published Feb 26, 2024 • 29
Do Large Language Models Latently Perform Multi-Hop Reasoning?

Paper • 2402.16837 • Published Feb 26, 2024 • 29
Divide-or-Conquer? Which Part Should You Distill Your LLM?

Paper • 2402.15000 • Published Feb 22, 2024 • 24

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs