
# final_merge

This is a merge of pre-trained language models created using mergekit.
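
A minimal usage sketch with 🤗 Transformers is shown below. The repository id is a placeholder; substitute the actual local path or Hub id of these merged weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/final_merge"  # placeholder: local path or Hub id of this merge

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",
)

prompt = "日本の首都はどこですか？"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```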

## Merge Details

### Merge Method

This model was merged using the DARE TIES merge method, with ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252 as the base model.
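
DARE TIES first applies DARE's random drop-and-rescale to each model's task vector (its difference from the base weights), then resolves sign conflicts TIES-style before adding the combined delta back onto the base. The snippet below is a toy per-tensor sketch of that combination, not mergekit's actual implementation; the `densities` and `weights` arguments correspond to the per-slice `density` and `weight` parameters in the configuration further down.

```python
import torch

def dare_ties_tensor(base, finetuned, densities, weights, normalize=True):
    """Toy per-tensor sketch of DARE TIES (illustrative only).

    base:       base-model parameter tensor
    finetuned:  list of corresponding tensors from the merged-in models
    densities:  per-model keep probabilities (DARE drop rate = 1 - density)
    weights:    per-model merge weights
    """
    contributions = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base                                    # task vector
        keep = (torch.rand_like(delta) < density).to(delta)  # DARE random mask
        contributions.append(weight * keep * delta / density)  # rescale survivors

    stacked = torch.stack(contributions)
    # TIES-style sign election: keep only contributions whose sign matches
    # the majority (by summed value) sign for each parameter.
    elected = torch.sign(stacked.sum(dim=0))
    agree = (torch.sign(stacked) == elected).to(stacked)
    merged = (stacked * agree).sum(dim=0)

    if normalize:
        # Divide by the total weight that actually contributed to each entry.
        w = torch.tensor(weights, dtype=stacked.dtype).view(-1, *([1] * base.dim()))
        merged = merged / (agree * w).sum(dim=0).clamp_min(1e-8)

    return base + merged
```

For layers 0–3, for example, the three source models contribute with densities of roughly 0.86, 0.93, and 1.0 and weights of roughly 0.23, 0.50, and 0.65, per the configuration below.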

### Models Merged

The following models were included in the merge:

* ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
* ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.863485562098192
      weight: 0.22651847020495885
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 0.9343953420777168
      weight: 0.5036150562646258
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 1.0
      weight: 0.6451005324417585
- sources:
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.9846266882538002
      weight: 0.5639921695621852
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.3231299604274662
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.9908955898534834
      weight: 0.21486915206711796
- sources:
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.9065299264285266
      weight: 0.2987555834921648
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 0.8840782503058148
      weight: 0.26619854603379545
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.9914153096559333
      weight: 0.4573592950405189
- sources:
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.9740298213855892
      weight: 0.48137164129667176
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.27412584703978277
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.8407412390278275
      weight: 0.3182141906839257
- sources:
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 1.0
      weight: 0.2240504757935422
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.23938850503773312
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.9687795057288319
      weight: 0.5987730759861593
- sources:
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 1.0
      weight: 0.09945022964618122
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.26835539762495914
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.8139356897740962
      weight: 0.4942452603808056
- sources:
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 1.0
      weight: 0.20318580465269015
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.16861512537170825
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 1.0
      weight: 0.15118597877918583
- sources:
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Llama-3.1-Swallow-8B-v0.2_4249862252
    parameters:
      density: 0.7988559962120717
      weight: 0.34008425117612984
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/llama-3-chinese-8b_120379959
    parameters:
      density: 1.0
      weight: 0.2824977970939407
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Llama-3-ELYZA-JP-8B_2371007997
    parameters:
      density: 0.7131873401997189
      weight: 0.5228166170045327
tokenizer_source: base
```
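
A merge like this can typically be reproduced by saving the configuration above to a file and invoking mergekit on it. The sketch below uses mergekit's Python entry points as documented in its README; the exact imports and options may differ between mergekit versions, `CONFIG_PATH` and `OUT_PATH` are placeholders, and the relative input-model paths in the config must exist locally.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_PATH = "final_merge_config.yml"  # the YAML above, saved locally (placeholder name)
OUT_PATH = "./final_merge"              # where to write the merged model (placeholder)

# Parse the YAML configuration into mergekit's config object.
with open(CONFIG_PATH, "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the merge; the options here are illustrative and can be adjusted.
run_merge(
    merge_config,
    out_path=OUT_PATH,
    options=MergeOptions(cuda=True, copy_tokenizer=True, lazy_unpickle=False),
)
```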