---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- mergekit
- merge
base_model:
- sometimesanotion/Qwen2.5-14B-Vimarckoso-v3
- sometimesanotion/Lamarck-14B-v0.3
- sometimesanotion/Qwenvergence-14B-v3-Prose
- Krystalan/DRT-o1-14B
- underwoods/medius-erebus-magnum-14b
- sometimesanotion/Abliterate-Qwenvergence
metrics:
- accuracy
pipeline_tag: text-generation
---

![Lamarck.webp](https://huggingface.co/sometimesanotion/Lamarck-14B-v0.6-rc4/resolve/main/Lamarck.webp)

---

Lamarck 14B v0.6 is a generalist merge focused on multi-step reasoning, prose, multi-language ability, and code. It is built from components that have punched above their weight in the 14-billion-parameter class.

Previous releases were based on a SLERP merge of model_stock->della branches focused on reasoning and prose. The prose branch got surprisingly good at reasoning, and the reasoning branch, whose base was chosen for IFEVAL, became an all-around generalist. Some of you have already downloaded that reasoning branch, released as [sometimesanotion/Qwen2.5-14B-Vimarckoso-v3](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3).

Lamarck v0.6 aims to build on Vimarckoso v3's all-around strength with strong buffs to prose and translation quality, and strong reasoning for its class. Updates will follow as leaderboard results become available for in-depth evaluation.

## Merge Details

This model was made in two branches: a della_linear merge on one side, and a model_stock merge followed by a breadcrumbs merge on the other. The two branches were then SLERP-merged; the branch configurations are not reproduced in full here, but an illustrative sketch of the della_linear branch follows the final configuration below.

### Models Merged

The model_stock, breadcrumbs, and della_linear merges all use the following models:

- [sometimesanotion/Qwen2.5-14B-Vimarckoso-v3](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3)
- [sometimesanotion/Lamarck-14B-v0.3](https://huggingface.co/sometimesanotion/Lamarck-14B-v0.3)
- [sometimesanotion/Qwenvergence-14B-v3-Prose](https://huggingface.co/sometimesanotion/Qwenvergence-14B-v3-Prose) - a model_stock merge of multiple prose-oriented models which posts surprisingly high MATH, GPQA, and MUSR scores
- [Krystalan/DRT-o1-14B](https://huggingface.co/Krystalan/DRT-o1-14B) - a particularly interesting model which applies extra reasoning to language translation; check out the fascinating research paper at [arxiv.org/abs/2412.17498](https://arxiv.org/abs/2412.17498)
- [underwoods/medius-erebus-magnum-14b](https://huggingface.co/underwoods/medius-erebus-magnum-14b)
- [sometimesanotion/Abliterate-Qwenvergence](https://huggingface.co/sometimesanotion/Abliterate-Qwenvergence) - a custom version of [huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2](https://huggingface.co/huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2)

### Configuration

The following YAML configuration SLERP-merges the della_linear and breadcrumbs branches into the final model:
```yaml
name: Lamarck-14B-v0.6-rc4
merge_method: slerp
base_model: sometimesanotion/lamarck-14b-converge-della-linear
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
  t:
    - value: 0.30
slices:
  - sources:
      - model: sometimesanotion/lamarck-14b-converge-della-linear
        layer_range: [ 0, 8 ]
      - model: sometimesanotion/lamarck-14b-converge-breadcrumbs
        layer_range: [ 0, 8 ]
  - sources:
      - model: sometimesanotion/lamarck-14b-converge-della-linear
        layer_range: [ 8, 16 ]
      - model: sometimesanotion/lamarck-14b-converge-breadcrumbs
        layer_range: [ 8, 16 ]
  - sources:
      - model: sometimesanotion/lamarck-14b-converge-della-linear
        layer_range: [ 16, 24 ]
      - model: sometimesanotion/lamarck-14b-converge-breadcrumbs
        layer_range: [ 16, 24 ]
  - sources:
      - model: sometimesanotion/lamarck-14b-converge-della-linear
        layer_range: [ 24, 32 ]
      - model: sometimesanotion/lamarck-14b-converge-breadcrumbs
        layer_range: [ 24, 32 ]
  - sources:
      - model: sometimesanotion/lamarck-14b-converge-della-linear
        layer_range: [ 32, 40 ]
      - model: sometimesanotion/lamarck-14b-converge-breadcrumbs
        layer_range: [ 32, 40 ]
  - sources:
      - model: sometimesanotion/lamarck-14b-converge-della-linear
        layer_range: [ 40, 48 ]
      - model: sometimesanotion/lamarck-14b-converge-breadcrumbs
        layer_range: [ 40, 48 ]
```
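The configurations that produced the two contributing branches are not included in this card. As a rough illustration of the first branch, here is a minimal sketch of what a della_linear merge over the models listed above might look like. The branch name matches the one referenced in the final configuration, but the choice of base model and the per-model weight and density values are placeholders, not published values.

```yaml
# Hypothetical sketch of the della_linear branch: base model, weights, and
# densities are assumptions for illustration, not the published recipe.
name: lamarck-14b-converge-della-linear
merge_method: della_linear
base_model: sometimesanotion/Qwen2.5-14B-Vimarckoso-v3  # assumed base
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
models:
  - model: sometimesanotion/Lamarck-14B-v0.3
    parameters:
      weight: 1.0      # placeholder value
      density: 0.70    # placeholder value
  - model: sometimesanotion/Qwenvergence-14B-v3-Prose
    parameters:
      weight: 1.0
      density: 0.70
  - model: Krystalan/DRT-o1-14B
    parameters:
      weight: 1.0
      density: 0.70
  - model: underwoods/medius-erebus-magnum-14b
    parameters:
      weight: 1.0
      density: 0.70
  - model: sometimesanotion/Abliterate-Qwenvergence
    parameters:
      weight: 1.0
      density: 0.70
```

In this kind of config, the density values control how aggressively each model's delta parameters are pruned before the weighted combination. The breadcrumbs branch config would be structured similarly, swapping the merge method and its pruning parameters.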