base_model:
- arcee-ai/Virtuoso-Small
- rombodawg/Rombos-LLM-V2.6-Qwen-14b
- sometimesanotion/Qwentinuum-14B-v013
- sometimesanotion/Lamarck-14B-v0.3
- EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
- allura-org/TQ2.5-14B-Sugarquill-v1
- oxyapi/oxy-1-small
- v000000/Qwen2.5-Lumen-14B
- sthenno-com/miscii-14b-1225
- underwoods/medius-erebus-magnum-14b
- huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
language:
- en
metrics:
- accuracy
- code_eval
pipeline_tag: text-generation
Vimarckoso is a component of Lamarck with a recipe based on CultriX/Qwen2.5-14B-Wernicke. I set out to fix the initial version's instruction following without any great loss to reasoning. The results have been surprisingly good; model mergers are now building atop very strong finetunes!
As of this writing, with open-llm-leaderboard catching up on rankings, Vimarckoso v3 should join Arcee AI's Virtuoso-Small, Sthenno's miscii-14b-1225 and Cultrix's Qwen2.5-14B-Brocav3 at the top of the 14B parameter LLM category on this site. As the recipe below will show, their models contribute strongly to Virmarckoso - CultriX's through a strong influence on Lamarck v0.3. Congratulations to everyone whose work went into this!
Configuration
The following YAML configuration was used to produce this model:
name: Qwenvergence-14B-v6-Prose-model_stock
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B
tokenizer_source: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
parameters:
int8_mask: true
normalize: true
rescale: false
models:
- model: arcee-ai/Virtuoso-Small
- model: sometimesanotion/Lamarck-14B-v0.3
- model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
- model: allura-org/TQ2.5-14B-Sugarquill-v1
- model: oxyapi/oxy-1-small
- model: v000000/Qwen2.5-Lumen-14B
- model: sthenno-com/miscii-14b-1225
- model: sthenno-com/miscii-14b-1225
- model: underwoods/medius-erebus-magnum-14b
- model: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
dtype: float32
out_dtype: bfloat16
---
# Nifty TIES to achieve LoRA compatibility with Qwenvergence models
---
name: Qwenvergence-14B-v6-Prose
merge_method: ties
base_model: Qwen/Qwen2.5-14B
tokenizer_source: base
parameters:
density: 1.00
weight: 1.00
int8_mask: true
normalize: true
rescale: false
dtype: float32
out_dtype: bfloat16
models:
- model: sometimesanotion/Qwenvergence-14B-v6-Prose-slerp
parameters:
density: 1.00
weight: 1.00
---
name: Qwentinuum-14B-v6-Prose-slerp
merge_method: slerp
base_model: sometimesanotion/Qwenvergence-14B-v6-Prose
tokenizer_source: sometimesanotion/Qwenvergence-14B-v6-Prose
dtype: bfloat16
out_dtype: bfloat16
parameters:
int8_mask: true
normalize: true
rescale: false
parameters:
t:
- value: 0.40
slices:
- sources:
- model: sometimesanotion/Qwenvergence-14B-v6-Prose
layer_range: [ 0, 8 ]
- model: sometimesanotion/Qwentinuum-14B-v6
layer_range: [ 0, 8 ]
- sources:
- model: sometimesanotion/Qwenvergence-14B-v6-Prose
layer_range: [ 8, 16 ]
- model: sometimesanotion/Qwentinuum-14B-v6
layer_range: [ 8, 16 ]
- sources:
- model: sometimesanotion/Qwenvergence-14B-v6-Prose
layer_range: [ 16, 24 ]
- model: sometimesanotion/Qwentinuum-14B-v6
layer_range: [ 16, 24 ]
- sources:
- model: sometimesanotion/Qwenvergence-14B-v6-Prose
layer_range: [ 24, 32 ]
- model: sometimesanotion/Qwentinuum-14B-v6
layer_range: [ 24, 32 ]
- sources:
- model: sometimesanotion/Qwenvergence-14B-v6-Prose
layer_range: [ 32, 40 ]
- model: sometimesanotion/Qwentinuum-14B-v6
layer_range: [ 32, 40 ]
- sources:
- model: sometimesanotion/Qwenvergence-14B-v6-Prose
layer_range: [ 40, 48 ]
- model: sometimesanotion/Qwentinuum-14B-v6
layer_range: [ 40, 48 ]
---
name: Qwen2.5-14B-Vimarckoso-v3-slerp
merge_method: slerp
base_model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3-model_stock
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
parameters:
t:
- value: 0.20
slices:
- sources:
- model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3-model_stock
layer_range: [ 0, 48 ]
- model: sometimesanotion/Qwentinuum-14B-v6-Prose+sometimesanotion/Qwenvergence-Abliterate-256
layer_range: [ 0, 48 ]
---
name: Qwen2.5-14B-Vimarckoso-v3-slerp
merge_method: slerp
base_model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3-model_stock
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
parameters:
t:
- value: 0.20
slices:
- sources:
- model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3-model_stock
layer_range: [ 0, 48 ]
- model: sometimesanotion/Qwentinuum-14B-v6-Prose+sometimesanotion/Qwenvergence-Abliterate-256
layer_range: [ 0, 48 ]