kennymckormick committed on
Commit 4e15c72 · 1 Parent(s): 2347f30

update README

Files changed (1)
  1. lb_info.py +4 -3
lb_info.py CHANGED
@@ -25,8 +25,9 @@ CITATION_BUTTON_TEXT = r"""@misc{2023opencompass,
  CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
  # CONSTANTS-TEXT
  LEADERBORAD_INTRODUCTION = """# OpenVLM Leaderboard
- ### Welcome to the OpenVLM Leaderboard! On this leaderboard we share the evaluation results of VLMs obtained by the OpenSource Framework [**VLMEvalKit**](https://github.com/open-compass/VLMEvalKit) 🏆
- ### Currently, OpenVLM Leaderboard covers {} different VLMs (including GPT-4v, Gemini, QwenVLPlus, LLaVA, etc.) and {} different multi-modal benchmarks.
+ ## Welcome to the OpenVLM Leaderboard! On this leaderboard we share the evaluation results of VLMs obtained by the OpenSource Framework:
+ ## [*VLMEvalKit*: A Toolkit for Evaluating Large Vision-Language Models](https://github.com/open-compass/VLMEvalKit) 🏆
+ ## Currently, OpenVLM Leaderboard covers {} different VLMs (including GPT-4v, Gemini, QwenVLPlus, LLaVA, etc.) and {} different multi-modal benchmarks.
 
  This leaderboard was last updated: {}.
  """
@@ -131,7 +132,7 @@ LEADERBOARD_MD['COCO_VAL'] = """
  """
 
  LEADERBOARD_MD['ScienceQA_VAL'] = """
- # ScienceQA Evaluation Results
+ ## ScienceQA Evaluation Results
 
  - We benchmark the **image** subset of ScienceQA validation and test set, and report the Top-1 accuracy.
  - During evaluation, we use `GPT-3.5-Turbo-0613` as the choice extractor for all VLMs if the choice can not be extracted via heuristic matching. **Zero-shot** inference is adopted.
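
The ScienceQA notes in the diff describe a two-stage answer extraction (heuristic matching first, an LLM extractor only as a fallback) followed by Top-1 accuracy. The sketch below illustrates that flow under stated assumptions; it is not taken from VLMEvalKit. The names `heuristic_choice`, `extract_choice`, `top1_accuracy`, and the `llm_extractor` callable are hypothetical, and the call to `GPT-3.5-Turbo-0613` is stubbed out rather than implemented.

```python
import re
from typing import Callable, Optional, Sequence


def heuristic_choice(response: str, options: Sequence[str]) -> Optional[str]:
    """Return an option letter only when the match is unambiguous, else None.

    Hypothetical heuristic: the real matching rules may differ.
    """
    letters = [chr(ord("A") + i) for i in range(len(options))]
    # Stage 1: standalone capital letters such as "B", "(B)" or "B."
    found = {m for m in re.findall(r"\b([A-Z])\b", response) if m in letters}
    if len(found) == 1:
        return found.pop()
    # Stage 2: the text of exactly one option appears in the response.
    lowered = response.lower()
    hits = [l for l, text in zip(letters, options) if text.lower() in lowered]
    return hits[0] if len(hits) == 1 else None


def extract_choice(
    response: str,
    options: Sequence[str],
    llm_extractor: Callable[[str, Sequence[str]], str],
) -> str:
    """Try heuristic matching first; call the LLM extractor only on failure."""
    choice = heuristic_choice(response, options)
    if choice is None:
        # Assumption: llm_extractor wraps a model call and returns a letter.
        choice = llm_extractor(response, options)
    return choice


def top1_accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of predictions that equal the ground-truth letter."""
    if not answers:
        return 0.0
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


if __name__ == "__main__":
    opts = ["cat", "dog", "bird"]
    stub = lambda response, options: "C"  # stand-in for the LLM extractor
    preds = [
        extract_choice("The answer is (B).", opts, stub),
        extract_choice("I am not sure", opts, stub),
    ]
    print(preds, top1_accuracy(preds, ["B", "A"]))
```

Keeping the extractor as an injected callable means the heuristic path can be exercised without any API access, and the fallback is only invoked for responses the heuristic cannot resolve.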