Announcing WebApp1K-Duo
onekq-ai/WebApp1K-Duo-React
This keeps the challenge going after OpenAI's o1 models saturated the WebApp1K benchmark. On the new benchmark, SOTA stands at 67%. Let the hill climbing commence!
onekq-ai/WebApp1K-models-leaderboard
PS: I will publish more findings soon.