Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arXiv:2510.24701

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 28
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 220
Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 13 days ago • 90
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3

3B • Updated May 21 • 259
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3

3B • Updated May 21 • 1.37k • 1

A Definition of AGI

Paper • 2510.18212 • Published 21 days ago • 33
Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 13 days ago • 90

Papers, datasets and models on deep research agents

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8 • 17
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B

Text Generation • 31B • Updated Oct 10 • 12.5k • 757
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published Jun 13 • 71
Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30 • 68

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 18
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 30
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 33
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

Agentic AI Training and Tuning

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 13 days ago • 90
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 11 days ago • 102

RL makes MLLMs see better than SFT

Paper • 2510.16333 • Published 24 days ago • 47
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback

Paper • 2510.16888 • Published 23 days ago • 19
Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published 25 days ago • 46
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation

Paper • 2510.21583 • Published 18 days ago • 30

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 28
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 18
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 30
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 33
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 220
Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 13 days ago • 90
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3

3B • Updated May 21 • 259
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3

3B • Updated May 21 • 1.37k • 1

Agentic AI Training and Tuning

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 13 days ago • 90
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 11 days ago • 102

A Definition of AGI

Paper • 2510.18212 • Published 21 days ago • 33
Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 13 days ago • 90

RL makes MLLMs see better than SFT

Paper • 2510.16333 • Published 24 days ago • 47
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback

Paper • 2510.16888 • Published 23 days ago • 19
Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published 25 days ago • 46
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation

Paper • 2510.21583 • Published 18 days ago • 30

Papers, datasets and models on deep research agents

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8 • 17
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B

Text Generation • 31B • Updated Oct 10 • 12.5k • 757
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published Jun 13 • 71
Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30 • 68

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs