eduagarcia committed
Commit: 43c2b1a
Parent(s): aa7060a

add dynamic documentation for RAW_RESULTS_REPO

Files changed: src/display/about.py (+4 -4)
src/display/about.py CHANGED

@@ -1,6 +1,6 @@
 from src.display.utils import ModelType
 from src.display.utils import Tasks
-from src.envs import REPO_ID, QUEUE_REPO, RESULTS_REPO, PATH_TO_COLLECTION, LEADERBOARD_NAME, TRUST_REMOTE_CODE, TASK_CONFIG
+from src.envs import REPO_ID, QUEUE_REPO, RESULTS_REPO, PATH_TO_COLLECTION, LEADERBOARD_NAME, TRUST_REMOTE_CODE, TASK_CONFIG, RAW_RESULTS_REPO
 
 LM_EVAL_URL = "https://github.com/eduagarcia/lm-evaluation-harness-pt"
 
@@ -72,7 +72,7 @@ We chose these benchmarks as they test a variety of reasoning and general knowle
 ## Details and logs
 You can find:
 - detailed numerical results in the `results` Hugging Face dataset: https://huggingface.co/datasets/{RESULTS_REPO}
-- details on the input/outputs for the models in the `details` of each model, that you can access by clicking the 📄 emoji after the model name
+{"- details on the input/outputs for the models in the `details` of each model, that you can access by clicking the 📄 emoji after the model name" if RAW_RESULTS_REPO is not None else ""}
 - community queries and running status in the `requests` Hugging Face dataset: https://huggingface.co/datasets/{QUEUE_REPO}
 
 ## Reproducibility
@@ -140,10 +140,10 @@ How can I report an evaluation failure?
 
 ## 2) Model results
 What kind of information can I find?
-- *Let's imagine you are interested in the Yi-34B results. You have access to 3 different information categories:*
+- *Let's imagine you are interested in the Yi-34B results. You have access to {"3" if RAW_RESULTS_REPO is not None else "2"} different information categories:*
 - *The [request file](https://huggingface.co/datasets/{QUEUE_REPO}/blob/main/01-ai/Yi-34B_eval_request_False_bfloat16_Original.json): it gives you information about the status of the evaluation*
 - *The [aggregated results folder](https://huggingface.co/datasets/{RESULTS_REPO}/tree/main/01-ai/Yi-34B): it gives you aggregated scores, per experimental run*
-- *The [details dataset](https://huggingface.co/datasets/{
+{"- *The [details dataset](https://huggingface.co/datasets/{RAW_RESULTS_REPO}/tree/main/01-ai/Yi-34B): it gives you the full details (scores and examples for each task and a given model)*" if RAW_RESULTS_REPO is not None else ""}
 
 
 Why do models appear several times in the leaderboard?
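
For context, a minimal sketch of how the conditional fragments added in this commit behave. This is not code from the repository; the stub value of RAW_RESULTS_REPO and the local variable names are illustrative (in about.py the setting is imported from src.envs and the fragments live inside larger documentation strings).

# Illustrative sketch only -- not code from this commit. It mimics how the
# conditional documentation fragments above evaluate. RAW_RESULTS_REPO is
# stubbed here; in about.py it is imported from src.envs.
RAW_RESULTS_REPO = None  # set to a dataset id string to enable the extra docs

details_bullet = (
    "- details on the input/outputs for the models in the `details` of each model, "
    "that you can access by clicking the 📄 emoji after the model name"
    if RAW_RESULTS_REPO is not None
    else ""
)
category_count = "3" if RAW_RESULTS_REPO is not None else "2"

print(details_bullet)  # prints an empty line when RAW_RESULTS_REPO is unset
print(f"You have access to {category_count} different information categories.")

With RAW_RESULTS_REPO left unset, the generated documentation drops the details bullet and the Yi-34B FAQ example counts only two information categories (the request file and the aggregated results folder).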