Spaces:

opencompass
/

Open_LMM_Reasoning_Leaderboard

Running

App Files Files Community

Open_LMM_Reasoning_Leaderboard / meta_data.py

KennyUTC

update meta data

86d69b4 about 1 month ago

raw

history blame

1.91 kB

	# CONSTANTS-URL
	URL = "http://opencompass.openxlab.space/assets/MathLB.json"
	# CONSTANTS-CITATION
	CITATION_BUTTON_TEXT = r"""\
	@inproceedings{duan2024vlmevalkit,
	title={Vlmevalkit: An open-source toolkit for evaluating large multi-modality models},
	author={Duan, Haodong and Yang, Junming and Qiao, Yuxuan and Fang, Xinyu and Chen, Lin and Liu, Yuan and Dong, Xiaoyi and Zang, Yuhang and Zhang, Pan and Wang, Jiaqi and others},
	booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
	pages={11198--11201},
	year={2024}
	}
	"""
	CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
	# CONSTANTS-TEXT
	LEADERBORAD_INTRODUCTION = """# Open LMM Reasoning Leaderboard

	This leaderboard aims at providing a comprehensive evaluation of the reasoning capabilities of LMMs.
	Currently, it is a collection of evaluation results on multiple multi-modal mathematical reasoning benchmarks.
	We obtain all evaluation results based on the [VLMEvalKit](https://github.com/open-compass/VLMEvalKit), with the corresponding dataset names:

	1. MathVista_MINI: The Test Mini split of MathVista dataset, around 1000 samples.
	2. MathVision: The Full test set of MathVision, around 3000 samples.
	3. MathVerse_MINI_Vision_Only: The Test Mini split of MathVerse, using the "Vision Only" mode, around 700 samples.
	4. DynaMath: The Full test set of DynaMath, around 5000 samples (501 original questions x 10 variants).

	To suggest new models or benchmarks for this leaderboard, please contact duanhaodong@pjlab.org.cn.
	"""

	# CONSTANTS-FIELDS
	DATASETS_ALL = ['MathVista', 'MathVision', 'MathVerse', 'DynaMath']
	DATASETS_ESS = ['MathVista', 'MathVision', 'MathVerse', 'DynaMath']
	META_FIELDS = ['Method', 'Param (B)', 'Language Model', 'Vision Model', 'OpenSource', 'Verified', 'Org']
	MODEL_SIZE = ['<4B', '4B-10B', '10B-20B', '20B-40B', '>40B', 'Unknown']
	MODEL_TYPE = ['OpenSource', 'API']