Update README.md

9cde01a verified 8 days ago

4.81 kB

	---
	base_model:
	- CultriX/Qwen2.5-14B-MegaMerge-pt1
	- CultriX/Qwen2.5-14B-Wernicke
	- CultriX/Qwen2.5-14B-MergeStock
	library_name: transformers
	tags:
	- mergekit
	- merge
	license: apache-2.0
	language:
	- en
	model-index:
	- name: Qwen2.5-14B-MegaMerge-pt2
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: IFEval (0-Shot)
	type: HuggingFaceH4/ifeval
	args:
	num_few_shot: 0
	metrics:
	- type: inst_level_strict_acc and prompt_level_strict_acc
	value: 52.35
	name: strict accuracy
	source:
	url: >-
	https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=CultriX/Qwen2.5-14B-MegaMerge-pt2
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: BBH (3-Shot)
	type: BBH
	args:
	num_few_shot: 3
	metrics:
	- type: acc_norm
	value: 50.64
	name: normalized accuracy
	source:
	url: >-
	https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=CultriX/Qwen2.5-14B-MegaMerge-pt2
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MATH Lvl 5 (4-Shot)
	type: hendrycks/competition_math
	args:
	num_few_shot: 4
	metrics:
	- type: exact_match
	value: 30.06
	name: exact match
	source:
	url: >-
	https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=CultriX/Qwen2.5-14B-MegaMerge-pt2
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: GPQA (0-shot)
	type: Idavidrein/gpqa
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 19.13
	name: acc_norm
	source:
	url: >-
	https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=CultriX/Qwen2.5-14B-MegaMerge-pt2
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MuSR (0-shot)
	type: TAUR-Lab/MuSR
	args:
	num_few_shot: 0
	metrics:
	- type: acc_norm
	value: 18.25
	name: acc_norm
	source:
	url: >-
	https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=CultriX/Qwen2.5-14B-MegaMerge-pt2
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MMLU-PRO (5-shot)
	type: TIGER-Lab/MMLU-Pro
	config: main
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 49.15
	name: accuracy
	source:
	url: >-
	https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=CultriX/Qwen2.5-14B-MegaMerge-pt2
	name: Open LLM Leaderboard
	metrics:
	- accuracy
	pipeline_tag: text-generation
	---
	# merge

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	## Merge Details
	### Merge Method

	This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [CultriX/Qwen2.5-14B-MegaMerge-pt1](https://huggingface.co/CultriX/Qwen2.5-14B-MegaMerge-pt1) as a base.

	### Models Merged

	The following models were included in the merge:
	* [CultriX/Qwen2.5-14B-Wernicke](https://huggingface.co/CultriX/Qwen2.5-14B-Wernicke)
	* [CultriX/Qwen2.5-14B-MergeStock](https://huggingface.co/CultriX/Qwen2.5-14B-MergeStock)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	# final_dare_ties_merge.yaml

	models:
	- model: CultriX/Qwen2.5-14B-MergeStock
	parameters:
	density: 0.5 # Retain 50% of the most significant parameters
	weight: 0.6 # Emphasize MergeStock's contributions
	- model: CultriX/Qwen2.5-14B-Wernicke
	parameters:
	density: 0.5 # Retain 50% of the most significant parameters
	weight: 0.4 # Incorporate Wernicke's contributions
	merge_method: dare_ties
	base_model: CultriX/Qwen2.5-14B-MegaMerge-pt1
	parameters:
	normalize: true
	int8_mask: true
	dtype: bfloat16
	tokenizer_source: Qwen/Qwen2.5-14B-Instruct

	```
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_CultriX__Qwen2.5-14B-MegaMerge-pt2)
	\| Metric \| Value \|
	\|------------------- \|------:\|
	\| Avg. \| 36.69 \|
	\| IFEval (0-Shot) \| 56.83 \|
	\| BBH (3-Shot) \| 50.91 \|
	\| MATH Lvl 5 (4-Shot)\| 27.34 \|
	\| GPQA (0-shot) \| 17.23 \|
	\| MuSR (0-shot) \| 18.74 \|
	\| MMLU-PRO (5-shot) \| 49.12 \|