HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 3 days ago • 24
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning Paper • 2603.26653 • Published 8 days ago • 15
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 21 days ago • 10