---
base_model:
- meta-llama/Meta-Llama-3.1-70B-Instruct
- mattshumer/Reflection-Llama-3.1-70B
library_name: transformers
tags:
- mergekit
- merge

---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

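This model was merged using the SLERP (spherical linear interpolation) merge method, with [mattshumer/Reflection-Llama-3.1-70B](https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B) as the base model.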
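
For readers unfamiliar with SLERP, the following is a minimal pure-Python sketch of spherical linear interpolation between two weight vectors. It is an illustration of the general technique, not mergekit's actual implementation; the function name `slerp`, the `eps` threshold, and the fallback to linear interpolation for near-parallel or near-zero vectors are assumptions made for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0 and t = 1 returns v1. Illustrative only: real merges
    operate on full weight tensors rather than Python lists.
    """
    def lerp():
        # Plain linear interpolation, used as a fallback (an assumption here).
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return lerp()

    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Vectors are (anti)parallel, so the spherical formula is ill-conditioned.
        return lerp()

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    # Halfway between two orthogonal unit vectors.
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike plain averaging, the interpolation follows the arc between the two vectors, so the result for two unit vectors stays on the unit sphere instead of shrinking toward the origin.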

### Models Merged

The following models were included in the merge:
* [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)
* [mattshumer/Reflection-Llama-3.1-70B](https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: slerp  # Define the merge method at the top level

slices:
  - sources:
      - model: mattshumer/Reflection-Llama-3.1-70B
        layer_range:
          - 0
          - 40   # Adjust layer range
      - model: meta-llama/Meta-Llama-3.1-70B-Instruct
        layer_range:
          - 0
          - 40
    base_model: mattshumer/Reflection-Llama-3.1-70B  # Define the base model at the slice level

parameters:
  t:
    - filter: self_attn
      value:
        - 0.1  # Modify weights for self attention
        - 0.5
        - 0.4
        - 0.8
        - 1
    - filter: mlp
      value:
        - 0.9  # Modify weights for MLP layers
        - 0.6
        - 0.7
        - 0.4
        - 0.2
    - value: 0.7  # General merge weight

dtype: bfloat16  # Keep for TPU efficiency
```
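
The `t` lists in the configuration above are gradients: mergekit spreads a list of values across the layer range by interpolation, so each layer gets its own interpolation factor. The sketch below shows one plausible reading of that behavior using simple linear interpolation between evenly spaced anchor points. The helper `gradient_value` is hypothetical, and the exact mapping from layer index to position is an assumption rather than mergekit's own code.

```python
def gradient_value(values, layer_idx, num_layers):
    """Linearly interpolate a list of anchor values at a given layer.

    Anchors are assumed to be evenly spaced from the first layer to the
    last layer of the slice (an assumption made for this sketch).
    """
    if num_layers < 2 or len(values) == 1:
        return float(values[0])
    # Position of this layer along the anchor list, in [0, len(values) - 1].
    pos = layer_idx / (num_layers - 1) * (len(values) - 1)
    lo = min(int(pos), len(values) - 2)
    frac = pos - lo
    return (1 - frac) * values[lo] + frac * values[lo + 1]


# Values copied from the configuration above.
SELF_ATTN_T = [0.1, 0.5, 0.4, 0.8, 1]
MLP_T = [0.9, 0.6, 0.7, 0.4, 0.2]
NUM_LAYERS = 40  # from layer_range 0 to 40

if __name__ == "__main__":
    for layer in (0, 10, 20, 30, 39):
        attn = gradient_value(SELF_ATTN_T, layer, NUM_LAYERS)
        mlp = gradient_value(MLP_T, layer, NUM_LAYERS)
        print(f"layer {layer:2d}: self_attn t={attn:.3f}, mlp t={mlp:.3f}")
```

Under this reading, with `t = 0` selecting the base model (Reflection-Llama-3.1-70B) and `t = 1` the other model, self-attention weights start close to the base model in early layers and move toward Meta-Llama-3.1-70B-Instruct in later ones, while MLP weights follow roughly the opposite trend. Tensors matched by neither filter use the fixed value 0.7.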