programbench/20260508_mini-v2.2.6_gpt-5-5-xhigh
Given only a compiled binary and its documentation, AI agents must architect and implement a complete codebase that reproduces the original program's behavior. ProgramBench evaluates this capability across 200 real-world open-source projects spanning Rust, Go, C, C++, Haskell, and Java.