bigcode-models-leaderboard

loubnabnl HF staff committed on Nov 15, 2023

Commit 947eb06 • 1 Parent(s): 08e5a25

Update app.py

Files changed (1)

app.py CHANGED

@@ -224,7 +224,7 @@ with demo:
                     - Win Rate represents how often a model outperforms other models in each language, averaged across all languages.
                     - The scores of instruction-tuned models might be significantly higher on humaneval-python than other languages. We use the instruction format of HumanEval. For other languages, we use base MultiPL-E prompts.
                     - For more details check the 📝 About section.
-                    - Models with a 🔴 symbol represent external evaluation results submission, this means that we didn't verify the results, you can find the author's submission under `Submission PR` field.
+                    - Models with a 🔴 symbol represent external evaluation submission, this means that we didn't verify the results, you can find the author's submission under `Submission PR` field from `See All Columns` tab.
                     """,
                         elem_classes="markdown-text",
                     )