---
base_model:
- arcee-ai/Virtuoso-Small
- sometimesanotion/Qwen2.5-14B-Vimarckoso-v3-slerp
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
language:
- en
metrics:
- accuracy
- code_eval
pipeline_tag: text-generation
---
Vimarckoso is a component of Lamarck, with a recipe based on [CultriX/Qwen2.5-14B-Wernicke](https://huggingface.co/CultriX/Qwen2.5-14B-Wernicke). I set out to fix the initial version's instruction following without any great loss of reasoning, and the results have been surprisingly good; model mergers are now building atop very strong finetunes!

As of this writing, with the [open-llm-leaderboard](https://huggingface.co/open-llm-leaderboard) catching up on rankings, Vimarckoso v3 should join Arcee AI's [Virtuoso-Small](https://huggingface.co/arcee-ai/Virtuoso-Small), Sthenno's [miscii-14b-1225](https://huggingface.co/sthenno-com/miscii-14b-1225), and CultriX's [Qwen2.5-14B-Brocav3](https://huggingface.co/CultriX/Qwen2.5-14B-Brocav3) at the top of the 14B-parameter LLM category on this site. As the recipe below shows, their models contribute strongly to Vimarckoso, with CultriX's doing so through its strong influence on Lamarck v0.3. Congratulations to everyone whose work went into this!
![Vimarckoso-v3.png](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3/resolve/main/Vimarckoso-v3.png)
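---
### Usage

This card is tagged for text generation, so a minimal chat-style sketch with the Hugging Face `transformers` API follows; the prompt, generation length, and device settings are illustrative assumptions, not part of the original recipe.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sometimesanotion/Qwen2.5-14B-Vimarckoso-v3"

# Load the tokenizer and model; device_map="auto" assumes accelerate is installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat-formatted prompt and generate a reply.
messages = [{"role": "user", "content": "Summarize what a model_stock merge does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```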
---
### Configuration
The following YAML configurations were used to produce this model:
```yaml
name: Qwenvergence-14B-v6-Prose-model_stock
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B
tokenizer_source: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
parameters:
  int8_mask: true
  normalize: true
  rescale: false
models:
  - model: arcee-ai/Virtuoso-Small
  - model: sometimesanotion/Lamarck-14B-v0.3
  - model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
  - model: allura-org/TQ2.5-14B-Sugarquill-v1
  - model: oxyapi/oxy-1-small
  - model: v000000/Qwen2.5-Lumen-14B
  - model: sthenno-com/miscii-14b-1225
  - model: sthenno-com/miscii-14b-1225
  - model: underwoods/medius-erebus-magnum-14b
  - model: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
dtype: float32
out_dtype: bfloat16
---
# Nifty TIES to achieve LoRA compatibility with Qwenvergence models
name: Qwenvergence-14B-v6-Prose
merge_method: ties
base_model: Qwen/Qwen2.5-14B
tokenizer_source: base
parameters:
  density: 1.00
  weight: 1.00
  int8_mask: true
  normalize: true
  rescale: false
dtype: float32
out_dtype: bfloat16
models:
  - model: sometimesanotion/Qwenvergence-14B-v6-Prose-slerp
    parameters:
      density: 1.00
      weight: 1.00
---
name: Qwentinuum-14B-v6-Prose-slerp
merge_method: slerp
base_model: sometimesanotion/Qwenvergence-14B-v6-Prose
tokenizer_source: sometimesanotion/Qwenvergence-14B-v6-Prose
dtype: bfloat16
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
  t:
    - value: 0.40
slices:
  - sources:
      - model: sometimesanotion/Qwenvergence-14B-v6-Prose
        layer_range: [ 0, 8 ]
      - model: sometimesanotion/Qwentinuum-14B-v6
        layer_range: [ 0, 8 ]
  - sources:
      - model: sometimesanotion/Qwenvergence-14B-v6-Prose
        layer_range: [ 8, 16 ]
      - model: sometimesanotion/Qwentinuum-14B-v6
        layer_range: [ 8, 16 ]
  - sources:
      - model: sometimesanotion/Qwenvergence-14B-v6-Prose
        layer_range: [ 16, 24 ]
      - model: sometimesanotion/Qwentinuum-14B-v6
        layer_range: [ 16, 24 ]
  - sources:
      - model: sometimesanotion/Qwenvergence-14B-v6-Prose
        layer_range: [ 24, 32 ]
      - model: sometimesanotion/Qwentinuum-14B-v6
        layer_range: [ 24, 32 ]
  - sources:
      - model: sometimesanotion/Qwenvergence-14B-v6-Prose
        layer_range: [ 32, 40 ]
      - model: sometimesanotion/Qwentinuum-14B-v6
        layer_range: [ 32, 40 ]
  - sources:
      - model: sometimesanotion/Qwenvergence-14B-v6-Prose
        layer_range: [ 40, 48 ]
      - model: sometimesanotion/Qwentinuum-14B-v6
        layer_range: [ 40, 48 ]
---
name: Qwen2.5-14B-Vimarckoso-v3-slerp
merge_method: slerp
base_model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3-model_stock
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
parameters:
  t:
    - value: 0.20
slices:
  - sources:
      - model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3-model_stock
        layer_range: [ 0, 48 ]
      - model: sometimesanotion/Qwentinuum-14B-v6-Prose+sometimesanotion/Qwenvergence-Abliterate-256
        layer_range: [ 0, 48 ]
```
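Each `---`-separated document in the block above is a separate mergekit stage; the stages are run in sequence, with each stage's output feeding the next. As a sketch of how a single stage can be executed, assuming mergekit's documented Python entry points and a hypothetical local copy of one stage saved as `vimarckoso-v3-slerp.yaml`:

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Hypothetical file holding exactly one of the stage configurations shown above.
CONFIG_YML = "vimarckoso-v3-slerp.yaml"

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Qwen2.5-14B-Vimarckoso-v3-slerp",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use the GPU for tensor math when available
        copy_tokenizer=True,             # write a tokenizer into the output directory
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```

The `mergekit-yaml` command-line tool accepts the same configuration file and an output directory, which is usually the simpler route.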