RUT-Bench
Benchmark data in "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions".
Miaow-Lab/RUT-Bench • Viewer • Updated 2 days ago • 1.64k • 57
Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions • Paper • 2606.03318 • Published 4 days ago
STT-Arena
Benchmark data, training data, and STT-Agent from our paper "STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics".
STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics • Paper • 2605.18548 • Published 19 days ago • 1
Miaow-Lab/STT-Agent-SFT • 196k • Updated 18 days ago • 30 • 1
Miaow-Lab/STT-Agent-RL • 196k • Updated 18 days ago • 31 • 1
Miaow-Lab/STT-Arena • Preview • Updated 18 days ago • 101 • 2