Commit History

Rename to Hide Standard Errors
585c3fa
verified

albertvillanova HF staff committed on

Use color map for Results metrics values
581682a
verified

albertvillanova HF staff committed on

Add checkbox in Details to show only differences
6cf57e4
verified

albertvillanova HF staff committed on

Add checkbox in Configs to show only differences
f12aa56
verified

albertvillanova HF staff committed on

Add checkbox in Results to hide stderr
54e105e
verified

albertvillanova HF staff committed on

Implement login for GPQA Details
26ef426
verified

albertvillanova HF staff committed on

Fix loading Details with documents containing end of lines
662ed4b
verified

albertvillanova HF staff committed on

Fix wrapping to keep non-str data
e3edf6d
verified

albertvillanova HF staff committed on

Make beta-version warning less formal
19a6010
verified

albertvillanova HF staff committed on

Escape HTML tags in data
bd64e7a
verified

albertvillanova HF staff committed on

Fix URL to Leaderboard
7647125
verified

albertvillanova HF staff committed on

Display loading message
8f7c83f
verified

albertvillanova HF staff committed on

Add additional info to task description
651545d
verified

albertvillanova HF staff committed on

Import constants as submodule
30a0c61
verified

albertvillanova HF staff committed on

Improve label of subtasks
b6f3b94
verified

albertvillanova HF staff committed on

Add description of Tasks
ca2b34f
verified

albertvillanova HF staff committed on

Hide Details for GPQA task
5009abb
verified

albertvillanova HF staff committed on

Fix Details subtask info
daff9c0
verified

albertvillanova HF staff committed on

Add warning as beta version
c1fc7f4
verified

albertvillanova HF staff committed on

Use magenta instead of red color
33d0dfb
verified

albertvillanova HF staff committed on

Load Details asynchronously
2f4d877
verified

albertvillanova HF staff committed on

Load results asynchronously
d0f55c6
verified

albertvillanova HF staff committed on

Remove unnecessary iteration
da4a3b1
verified

albertvillanova HF staff committed on

Remove ARC task by hiding from All
6099782
verified

albertvillanova HF staff committed on

Add description of the Space
7a0e5b8
verified

albertvillanova HF staff committed on

Pass fill_width to Gradio
4289e9d
verified

albertvillanova HF staff committed on

Change app emoji, link Results dataset and add tag
131a4d7
verified

albertvillanova HF staff committed on

Highlight exact_match and change colors
a4b20f4
verified

albertvillanova HF staff committed on

Highlight min/max Results accuracy
26e855f
verified

albertvillanova HF staff committed on

Fix overflow in Details
0a4c821
verified

albertvillanova HF staff committed on

Fix missing results by reading all files
8e404a5
verified

albertvillanova HF staff committed on

Fix clear_details for load_details_btn
1c1cb58
verified

albertvillanova HF staff committed on

Put Results and Configs tabs at top level and sync
9c39267
verified

albertvillanova HF staff committed on

Call update_load_results_component with multiple triggers
1f43e72
verified

albertvillanova HF staff committed on

Make Results tasks (in)visible
bf6ab81
verified

albertvillanova HF staff committed on

Remove All from Details tasks and rephrase
71dfe85
verified

albertvillanova HF staff committed on

Move Results Tasks below buttons
0e93f79
verified

albertvillanova HF staff committed on

Fix latest_result_path_per_model variable and refactor
c2c9efa
verified

albertvillanova HF staff committed on

Create src.details module
841e241
verified

albertvillanova HF staff committed on

Create src.results module
15c8167
verified

albertvillanova HF staff committed on

Create constants module
d9f31f1
verified

albertvillanova HF staff committed on

Add Clear Details button
07448fb
verified

albertvillanova HF staff committed on

Call display_details on multiple triggers
0d84f54
verified

albertvillanova HF staff committed on

Add Clear Results button
54202cb
verified

albertvillanova HF staff committed on

Trigger display_results on multiple triggers
8f68cc2
verified

albertvillanova HF staff committed on

Make display_results robust
7e19f96
verified

albertvillanova HF staff committed on