---
license: unknown
language:
- en
pipeline_tag: conversational
tags:
- frankenmerge
- 110b
---
# BigWeave v20 110b

<img src="https://cdn-uploads.huggingface.co/production/uploads/65a6db055c58475cf9e6def1/4CbbAN-X7ZWj702JrcCGH.png" width=600>

The BigWeave models aim to experimentally identify merge settings that increase model performance. The version number merely tracks the various attempts and is not a quality indicator. Only results demonstrating good performance are retained and shared.

# Prompting Format
Mistral, Vicuna and Alpaca.
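
The card does not spell out the exact templates; the following is a minimal sketch of the three formats in their common variants (the system-prompt wording in the Alpaca template is an assumption, not taken from this card):

```python
def mistral_prompt(user_msg: str) -> str:
    # Mistral instruction format: the user turn is wrapped in [INST] tags.
    return f"<s>[INST] {user_msg} [/INST]"

def vicuna_prompt(user_msg: str) -> str:
    # Vicuna 1.1 format: alternating USER/ASSISTANT turns.
    return f"USER: {user_msg}\nASSISTANT:"

def alpaca_prompt(user_msg: str) -> str:
    # Alpaca format: "### Instruction" / "### Response" section headers.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{user_msg}\n\n### Response:\n"
    )

print(mistral_prompt("Hello"))
```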

# Merge process
This is a merge of 152334H/miqu-1-70b-sf and lizpreciatior/lzlv_70b_fp16_hf. Using exl2 measurements, we identify the least important layers of lzlv. These layers are then extended with the layers between them to form longer runs of consecutive layers, and the resulting slices are inserted into miqu.

Merge configuration:
```
slices:
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [0, 1]
      - model: lizpreciatior/lzlv_70b_fp16_hf
        layer_range: [0, 1]
        parameters:
          weight: 0
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [1, 26]
  - sources:
      - model: lizpreciatior/lzlv_70b_fp16_hf
        layer_range: [9, 44]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [27, 52]
  - sources:
      - model: lizpreciatior/lzlv_70b_fp16_hf
        layer_range: [45, 60]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [53, 79]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [79, 80]
      - model: lizpreciatior/lzlv_70b_fp16_hf
        layer_range: [79, 80]
        parameters:
          weight: 0
merge_method: linear
parameters:
  weight: 1.0
dtype: float16
tokenizer_source: 152334H/miqu-1-70b-sf
```
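
The slice layout can be sanity-checked by summing the layer ranges (`layer_range` endpoints are half-open, as mergekit interprets them). A quick sketch, using short model aliases for readability:

```python
# Slices from the merge config above as (model, start, end), end exclusive.
slices = [
    ("miqu", 0, 1),    # linear-merged with lzlv at weight 0, i.e. effectively miqu
    ("miqu", 1, 26),
    ("lzlv", 9, 44),
    ("miqu", 27, 52),
    ("lzlv", 45, 60),
    ("miqu", 53, 79),
    ("miqu", 79, 80),  # linear-merged with lzlv at weight 0
]

total = sum(end - start for _, start, end in slices)
per_model: dict[str, int] = {}
for name, start, end in slices:
    per_model[name] = per_model.get(name, 0) + (end - start)

print(total, per_model)  # 128 {'miqu': 78, 'lzlv': 50}
```

128 layers versus the 80 layers of a single 70b donor works out to roughly 1.6x the parameter count, consistent with the ~110b label.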