gaia-benchmark/leaderboard

Clémentine committed on Nov 15, 2023

Commit b838ed1 • 1 Parent(s): 01d1bbb

text reorg

Files changed (1)

content.py

@@ -3,12 +3,13 @@ TITLE = """<h1 align="center" id="space-title">GAIA Leaderboard</h1>"""
 INTRODUCTION_TEXT = """
 GAIA is a benchmark which aims at evaluating next-generation LLMs (LLMs with augmented capabilities due to added tooling, efficient prompting, access to search, etc). (See our paper for more details.)
-## Context
-GAIA is made of more than 450 non-trivial questions, each with an unambiguous answer, requiring different levels of tooling and autonomy to solve. GAIA data can be found in this space (https://huggingface.co/datasets/gaia-benchmark/GAIA). Questions are contained in `metadata.jsonl`. Some questions come with an additional file, which can be found in the same folder and whose id is given in the field `file_name`.
-It is divided into 3 levels, where level 1 should be breakable by very good LLMs, and level 3 indicates a strong jump in model capabilities, each divided into a fully public dev set for validation, and a test set with private answers and metadata.
-# Submissions
+## Data
+GAIA is made of more than 450 non-trivial questions, each with an unambiguous answer, requiring different levels of tooling and autonomy to solve.
+It is therefore divided into 3 levels, where level 1 should be breakable by very good LLMs, and level 3 indicates a strong jump in model capabilities. Each level is divided into a fully public dev set for validation, and a test set with private answers and metadata.
+GAIA data can be found in this space (https://huggingface.co/datasets/gaia-benchmark/GAIA). Questions are contained in `metadata.jsonl`. Some questions come with an additional file, which can be found in the same folder and whose id is given in the field `file_name`.
+## Submissions
 Results can be submitted for both validation and test. Scores are expressed as the percentage of correct answers for a given split.
 We expect submissions to be json-line files with the following format. The first two fields are mandatory; `reasoning_trace` is optional:
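
Not part of the commit: the Data paragraph in this hunk says questions live in `metadata.jsonl` and that an attached file sits in the same folder under the name given in `file_name`. A minimal sketch of reading that layout follows, assuming the split has been downloaded to a local directory. The function name `load_questions` and the `split_dir` argument are hypothetical, and treating an empty or missing `file_name` as "no attachment" is an assumption, not something the text states.

```python
import json
from pathlib import Path


def load_questions(split_dir):
    """Read metadata.jsonl from split_dir and resolve each attached file.

    Returns a list of (record, attachment_path_or_None) pairs.
    """
    split_dir = Path(split_dir)
    pairs = []
    with open(split_dir / "metadata.jsonl", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            # Assumption: an absent or empty `file_name` means no attachment.
            name = record.get("file_name") or ""
            attachment = split_dir / name if name else None
            pairs.append((record, attachment))
    return pairs
```

Each returned pair keeps the raw record untouched, so any other fields in `metadata.jsonl` remain available to the caller without this sketch having to know their names.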