ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published 7 days ago • 48
hamishivi/2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1764018132_step_2450 8B • Updated 4 days ago • 30
hamishivi/2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1762677729_step_1300 8B • Updated 7 days ago • 26
hamishivi/2010_rl_rag_NAR8_testing64_gpt5_sft_31605__1__1762886037_checkpoints_step_1300 8B • Updated 8 days ago • 22
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments Paper • 2511.07317 • Published 23 days ago • 13
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 9 days ago • 53
DR Tulu Collection Models and data associated with DR Tulu, http://allenai-web/papers/drtulu • 5 items • Updated 9 days ago • 29
hamishivi/2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1762677729_step1900 8B • Updated 10 days ago • 31
hamishivi/2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1762677729_checkpoints_step_1700 8B • Updated 13 days ago • 121