---
license: mit
language:
- en
pipeline_tag: text-generation
---

Imatrix compressions of the full-precision (FP) merge "D_AU-Orac-13B-Tiefighter-slerp".

"Imatrix Plus" is an upgraded form of Imatrix which using full precision for specific parts of the compression. 
As a result all compressions will be slightly larger in size than standard 13B compressions.

This method results in a higher-quality model, especially at the lower (more aggressive) compression levels.
It is applied across all compressions from IQ1 to Q8; a sketch of the pipeline follows.
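
For context, an imatrix-guided quantization with llama.cpp is typically a two-step pipeline: build an importance matrix from a calibration text, then quantize with that matrix steering per-tensor precision. The sketch below is a hypothetical reproduction, not the exact commands used here; the binary names (`imatrix`, `quantize`) and all file paths are assumptions and vary across llama.cpp versions.

```python
# Hypothetical two-step imatrix pipeline using llama.cpp tools.
# Binary names and file paths are assumptions; they vary by llama.cpp build.
import subprocess

# 1. Build an importance matrix from a calibration corpus.
subprocess.run([
    "./imatrix",
    "-m", "merge-f16.gguf",    # assumed name for the full-precision merge
    "-f", "calibration.txt",   # a larger-than-standard calibration file was used here
    "-o", "imatrix.dat",
], check=True)

# 2. Quantize, letting the importance matrix guide per-tensor precision.
subprocess.run([
    "./quantize",
    "--imatrix", "imatrix.dat",
    "merge-f16.gguf",
    "merge-IQ1_S.gguf",
    "IQ1_S",                   # any target from IQ1 to Q8 works the same way
], check=True)
```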

Even IQ1_S, the most compressed version, works well; however, IQ4/Q4 is suggested as the minimum for quality.
The highest quality will be Q6/Q8.

How big a difference does this merge make?

Original Tiefighter IQ1_S (with imatrix enhancements) tested at a perplexity of: 
PPL = 17.2589 +/- 0.12466*

Tiefighter Orca 2 IQ1_S (with imatrix enhancements) tested at a perplexity of: 
PPL = 12.6985 +/- 0.09106*

Note that LOWER perplexity is better.

* Tested using llama.cpp's perplexity.exe with wiki.raw. 
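
For reference, perplexity is the exponential of the mean negative log-likelihood per token, so lower values mean the model assigns higher probability to the reference text. A minimal illustration of the metric itself (not the llama.cpp implementation; the log-probabilities below are made up):

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """PPL = exp(mean negative log-likelihood per token); lower is better."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Toy example with hypothetical per-token log-probabilities:
print(perplexity([-2.1, -0.4, -1.3, -0.9]))  # ~3.24
```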

In addition, the Imatrix file used to "fix" the compressed files post-compression resulted in 
over two full points lower perplexity at IQ1_S versus some of the other "Imatrix" files currently in use.

Original Tiefighter IQ1_S (with imatrix enhancements), tested with a different "Imatrix" repair file, scored a perplexity of: 
PPL = 19.6355 +/- 0.14435

Likewise, the merge itself affected perplexity.

This merge was an experiment to combine the already established Roleplay, Fiction and Story 
generation of "Tiefighter" with some of "Orca 2"'s qualities.

Additional merge experiments are in progress.

For Imatrix Plus, this was a test of keeping high precision in specific areas of the model, leading to slightly larger compressed files.
In addition, the Imatrix process itself used a larger "calibration" file than standard to further enhance quality.

The process added approximately 310 MB to each compressed file.

A blank or standard Alpaca template will work for text generation.
"CHATML" is currently untested.

Context length: 4096.
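
A minimal loading sketch, assuming llama-cpp-python as the runtime (any GGUF-capable runtime works); the model filename and the instruction text below are hypothetical:

```python
from llama_cpp import Llama

# Filename is hypothetical; pick any of the IQ1-Q8 GGUF files from this repo.
llm = Llama(model_path="D_AU-Orac-13B-Tiefighter-slerp-Q6_K.gguf", n_ctx=4096)

# Standard Alpaca-style template; a blank template also works per the notes above.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a short scene set on a starship bridge.\n\n"
    "### Response:\n"
)
print(llm(prompt, max_tokens=256)["choices"][0]["text"])
```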

Please see the original model card for specific details of use, additional credits and tips:

[KoboldAI/LLaMA2-13B-Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter)

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the SLERP merge method.

### Models Merged

The following models were included in the merge:
* [microsoft/Orca-2-13b](https://huggingface.co/microsoft/Orca-2-13b)
* [KoboldAI/LLaMA2-13B-Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: KoboldAI/LLaMA2-13B-Tiefighter
        layer_range: [0, 40]
      - model: microsoft/Orca-2-13b
        layer_range: [0, 40]
merge_method: slerp
base_model: microsoft/Orca-2-13b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

```
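
In this configuration, `t` is the SLERP interpolation weight, and the per-filter value lists form a gradient across layer blocks: assuming mergekit's convention that t=0 selects the base model, self-attention leans toward Orca-2 in early layers and Tiefighter in later ones, the MLP filter runs the reverse, and all other tensors blend evenly at 0.5. To reproduce the merge, mergekit's `mergekit-yaml` entry point can be pointed at this configuration; a hedged sketch, where the config and output paths are assumptions:

```python
# Hypothetical reproduction of the merge via mergekit's CLI entry point.
# The config filename and output directory are assumptions.
import subprocess

subprocess.run(
    ["mergekit-yaml", "config.yaml", "./D_AU-Orac-13B-Tiefighter-slerp"],
    check=True,
)
```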