-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 28 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
Collections
Discover the best community collections!
Collections including paper arXiv:2510.24701
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 220 -
Tongyi DeepResearch Technical Report
Paper • 2510.24701 • Published • 90 -
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3
3B • Updated • 259 -
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3
3B • Updated • 1.37k • 1
-
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Paper • 2509.06283 • Published • 17 -
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
Text Generation • 31B • Updated • 12.5k • 757 -
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 71 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 18 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 30 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 33 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7
-
RL makes MLLMs see better than SFT
Paper • 2510.16333 • Published • 47 -
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback
Paper • 2510.16888 • Published • 19 -
Reasoning with Sampling: Your Base Model is Smarter Than You Think
Paper • 2510.14901 • Published • 46 -
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation
Paper • 2510.21583 • Published • 30
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 28 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 18 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 30 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 33 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 220 -
Tongyi DeepResearch Technical Report
Paper • 2510.24701 • Published • 90 -
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3
3B • Updated • 259 -
PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3
3B • Updated • 1.37k • 1
-
RL makes MLLMs see better than SFT
Paper • 2510.16333 • Published • 47 -
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback
Paper • 2510.16888 • Published • 19 -
Reasoning with Sampling: Your Base Model is Smarter Than You Think
Paper • 2510.14901 • Published • 46 -
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation
Paper • 2510.21583 • Published • 30
-
SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents
Paper • 2509.06283 • Published • 17 -
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
Text Generation • 31B • Updated • 12.5k • 757 -
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 71 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 68