# final_merge
This is a merge of pre-trained language models created using mergekit.
## Merge Details

### Merge Method
This model was merged using the DARE TIES merge method, with ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252 as the base model.
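DARE TIES combines two ideas: DARE randomly drops a fraction (1 − density) of each source model's delta from the base and rescales the surviving entries by 1/density, and TIES resolves sign conflicts between the remaining deltas before adding them back onto the base. The sketch below illustrates that per-tensor logic in plain NumPy; it is not mergekit's implementation, and details such as the final normalization (enabled here via `normalize: 1.0`) are simplified assumptions.

```python
import numpy as np

def dare_ties_merge(base, finetuned, densities, weights, seed=0):
    """Illustrative per-tensor DARE-TIES merge (a sketch, not mergekit's code).

    base      : np.ndarray, base-model tensor
    finetuned : list of np.ndarray, fine-tuned tensors with the same shape
    densities : list of float, fraction of delta entries kept per model (DARE)
    weights   : list of float, per-model mixing weights
    """
    rng = np.random.default_rng(seed)

    # DARE: drop delta entries with probability (1 - density), rescale the rest.
    pruned = []
    for ft, density in zip(finetuned, densities):
        delta = ft - base
        keep = rng.random(delta.shape) < density
        pruned.append(np.where(keep, delta, 0.0) / density)

    # TIES sign election: majority sign of the weighted, pruned deltas.
    elected_sign = np.sign(sum(w * d for w, d in zip(weights, pruned)))

    # Keep only contributions that agree with the elected sign, then normalize
    # by the total weight of the agreeing contributors (assumption for normalize: 1.0).
    merged = np.zeros_like(base)
    total_weight = np.zeros_like(base)
    for w, d in zip(weights, pruned):
        agree = (np.sign(d) == elected_sign) & (d != 0)
        merged += np.where(agree, w * d, 0.0)
        total_weight += np.where(agree, w, 0.0)

    merged = np.divide(merged, total_weight,
                       out=np.zeros_like(merged), where=total_weight > 0)
    return base + merged

# Toy usage with random tensors and the slice-[0, 4] densities/weights from the config below.
base = np.random.default_rng(1).normal(size=(4, 4))
fts = [base + np.random.default_rng(i).normal(scale=0.01, size=base.shape) for i in (2, 3, 4)]
merged = dare_ties_merge(base, fts,
                         densities=[0.8635, 0.9344, 1.0],
                         weights=[0.2265, 0.5036, 0.6451])
```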
### Models Merged
The following models were included in the merge:
- ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
- ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
## Configuration
The following YAML configuration was used to produce this model:
```yaml
base_model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.863485562098192
      weight: 0.22651847020495885
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 0.9343953420777168
      weight: 0.5036150562646258
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 1.0
      weight: 0.6451005324417585
- sources:
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.9846266882538002
      weight: 0.5639921695621852
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.3231299604274662
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.9908955898534834
      weight: 0.21486915206711796
- sources:
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.9065299264285266
      weight: 0.2987555834921648
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 0.8840782503058148
      weight: 0.26619854603379545
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.9914153096559333
      weight: 0.4573592950405189
- sources:
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.9740298213855892
      weight: 0.48137164129667176
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.27412584703978277
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.8407412390278275
      weight: 0.3182141906839257
- sources:
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 1.0
      weight: 0.2240504757935422
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.23938850503773312
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.9687795057288319
      weight: 0.5987730759861593
- sources:
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 1.0
      weight: 0.09945022964618122
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.26835539762495914
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.8139356897740962
      weight: 0.4942452603808056
- sources:
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 1.0
      weight: 0.20318580465269015
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.16861512537170825
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 1.0
      weight: 0.15118597877918583
- sources:
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.7988559962120717
      weight: 0.34008425117612984
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.2824977970939407
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.7131873401997189
      weight: 0.5228166170045327
tokenizer_source: base
```
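This is a standard mergekit slice configuration, so re-running the merge should amount to passing it to mergekit's `mergekit-yaml` command-line tool with the same input checkpoints available at the listed paths. The merged output is an ordinary Llama-3-architecture checkpoint and can be loaded with Hugging Face Transformers. Below is a minimal loading sketch; `path/to/final_merge` is a placeholder for wherever the merge output was saved or uploaded.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with the local output directory or Hub repo id of this merge.
model_id = "path/to/final_merge"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

prompt = "富士山について教えてください。"  # example prompt; replace with your own
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```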