Commit ced39c8 by jukofyork (1 parent: c87c2c1)

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ base_model: []
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # miquplus-xwin-70b
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method, with /home/juk/LLMs/models/huggingface/miqu-1-70b-sf as the base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * miqu-models/_miquplus-xwin-70b
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model:
+   model:
+     path: /home/juk/LLMs/models/huggingface/miqu-1-70b-sf
+ dtype: float16
+ merge_method: linear
+ slices:
+ - sources:
+   - layer_range: [0, 80]
+     model:
+       model:
+         path: /home/juk/LLMs/models/huggingface/miqu-1-70b-sf
+     parameters:
+       weight:
+       - filter: v_proj
+         value: [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
+       - filter: o_proj
+         value: [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
+       - filter: up_proj
+         value: [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
+       - filter: gate_proj
+         value: [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
+       - filter: down_proj
+         value: [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
+       - value: 1.0
+   - layer_range: [0, 80]
+     model:
+       model:
+         path: miqu-models/_miquplus-xwin-70b
+     parameters:
+       weight:
+       - filter: v_proj
+         value: [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
+       - filter: o_proj
+         value: [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
+       - filter: up_proj
+         value: [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
+       - filter: gate_proj
+         value: [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
+       - filter: down_proj
+         value: [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
+       - value: 0.0
+ tokenizer_source: base
+ ```
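
Note on the configuration: the 11-element `value` lists are shorter than the 80-layer range, so they have to be spread across the layers somehow. The Python sketch below shows one plausible reading. It assumes each list is a gradient that is linearly interpolated over the layer range, with a layer's position taken as `layer / (num_layers - 1)`. The function names `interpolate_gradient` and `layer_weights` are hypothetical and are not mergekit APIs, and the position formula is an assumption rather than something this commit states.

```python
# Hypothetical helpers for illustration only; not part of mergekit.

# Per-layer weight gradients copied from the YAML configuration.
BASE_GRADIENT = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
OTHER_GRADIENT = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]


def interpolate_gradient(values, t):
    """Piecewise-linear interpolation of `values` at position t in [0, 1]."""
    scaled = t * (len(values) - 1)
    lo = int(scaled)                      # anchor point at or below t
    hi = min(lo + 1, len(values) - 1)     # next anchor point, clamped at the end
    frac = scaled - lo
    return (1.0 - frac) * values[lo] + frac * values[hi]


def layer_weights(layer, num_layers=80):
    """Return (base_weight, other_weight) for one layer index.

    Assumes the layer's position in the gradient is layer / (num_layers - 1).
    """
    t = layer / (num_layers - 1)
    return (
        interpolate_gradient(BASE_GRADIENT, t),
        interpolate_gradient(OTHER_GRADIENT, t),
    )


if __name__ == "__main__":
    for layer in (0, 8, 12, 16, 40, 64, 72, 79):
        base_w, other_w = layer_weights(layer)
        print(f"layer {layer:2d}: base={base_w:.3f} other={other_w:.3f}")
```

Under this assumption, the filtered projection tensors (`v_proj`, `o_proj`, `up_proj`, `gate_proj`, `down_proj`) come from the base model in the outermost layers and from `miqu-models/_miquplus-xwin-70b` in the middle layers, with a short blend in between. Tensors that match none of the filters would fall through to the unfiltered entries (`1.0` for the base model, `0.0` for the other) and so come from the base model alone.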