DavidAU committed
Commit
9e9b2f3
1 Parent(s): 5328210

Delete README.md

Files changed (1)
  1. README.md +0 -202
README.md DELETED
@@ -1,202 +0,0 @@
- ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # L3-Steno-Maid-Black-LARGE-s1-36-sxy
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the passthrough merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * G:/7B/L3-Lumimaid-8B-v0.1-OAS
- * G:/7B/L3-Jamet-8B-MK.V-Blackroot
- * G:/7B/L3-8B-Stheno-v3.2
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- # 32 layers -> VS 40
- #models:
- #  - model: G:/7B/L3-8B-Stheno-v3.2
- #  - model: G:/7B/Llama-3-Lumimaid-8B-v0.1-OAS
- #  - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
- #merge_method: model_stock
- #base_model: G:/7B/L3-8B-Stheno-v3.2
- #dtype: float32
-
- slices:
-   - sources:
-       - model: G:/7B/L3-8B-Stheno-v3.2
-         layer_range: [0, 14]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 1
-             - filter: down_proj
-               value: 1
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-Lumimaid-8B-v0.1-OAS
-         layer_range: [8, 20]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 1
-             - filter: down_proj
-               value: 1
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
-         layer_range: [12, 24]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 1
-             - filter: down_proj
-               value: 1
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-8B-Stheno-v3.2
-         layer_range: [14, 20]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.8
-             - filter: down_proj
-               value: 0.8
-             - value: 0.8
-   - sources:
-       - model: G:/7B/L3-8B-Stheno-v3.2
-         layer_range: [20, 25]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.8
-             - filter: down_proj
-               value: 0.8
-             - value: 0.8
-   - sources:
-       - model: G:/7B/L3-8B-Stheno-v3.2
-         layer_range: [25, 27]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.6
-             - filter: down_proj
-               value: 0.6
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-8B-Stheno-v3.2
-         layer_range: [27, 28]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.9
-             - filter: down_proj
-               value: 0.9
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-Lumimaid-8B-v0.1-OAS
-         layer_range: [20, 25]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 1
-             - filter: down_proj
-               value: 1
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-Lumimaid-8B-v0.1-OAS
-         layer_range: [25, 27]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.6
-             - filter: down_proj
-               value: 0.6
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-Lumimaid-8B-v0.1-OAS
-         layer_range: [27, 31]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 1
-             - filter: down_proj
-               value: 1
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
-         layer_range: [24, 31]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 1
-             - filter: down_proj
-               value: 1
-             - value: 1
-   - sources:
-       - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
-         layer_range: [31, 32]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.3333333333333
-             - filter: down_proj
-               value: 0.3333333333333
-             - value: 0.3333333333333
-   - sources:
-       - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
-         layer_range: [31, 32]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.4444444444444
-             - filter: down_proj
-               value: 0.4444444444444
-             - value: 0.4444444444444
-   - sources:
-       - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
-         layer_range: [31, 32]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.5555555555555
-             - filter: down_proj
-               value: 0.5555555555555
-             - value: 0.5555555555555
-   - sources:
-       - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
-         layer_range: [31, 32]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.6666666666666
-             - filter: down_proj
-               value: 0.6666666666666
-             - value: 0.6666666666666
-   - sources:
-       - model: G:/7B/L3-Jamet-8B-MK.V-Blackroot
-         layer_range: [31, 32]
-         parameters:
-           scale:
-             - filter: o_proj
-               value: 0.7777777777777
-             - filter: down_proj
-               value: 0.7777777777777
-             - value: 0.8888888888888
- merge_method: passthrough
- dtype: float16
- ```
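Not part of the original card, but a quick sanity check on the stacked layout: under a passthrough merge the depth of the output model is simply the sum of the slice widths (`end - start` for each `layer_range`), since passthrough concatenates slices without mixing weights. A minimal sketch with the ranges copied from the YAML above (the per-slice comments just restate the config; nothing here comes from mergekit itself):

```python
# Each slice contributes (end - start) layers; passthrough concatenates them.
# Ranges copied verbatim from the deleted config above.
layer_ranges = [
    (0, 14),   # L3-8B-Stheno-v3.2
    (8, 20),   # L3-Lumimaid-8B-v0.1-OAS
    (12, 24),  # L3-Jamet-8B-MK.V-Blackroot
    (14, 20),  # Stheno, scaled 0.8
    (20, 25),  # Stheno, scaled 0.8
    (25, 27),  # Stheno, scaled 0.6
    (27, 28),  # Stheno, scaled 0.9
    (20, 25),  # Lumimaid
    (25, 27),  # Lumimaid, scaled 0.6
    (27, 31),  # Lumimaid
    (24, 31),  # Blackroot
] + [(31, 32)] * 5  # Blackroot's final layer repeated five times at rising scales

total_layers = sum(end - start for start, end in layer_ranges)
print(total_layers)  # 75 layers in the merged stack, vs. 32 in each 8B donor
```

Note that several ranges overlap or repeat (e.g. Stheno layers 20-25 appear alongside Lumimaid layers 20-25, and Blackroot layer 31 is duplicated five times), which is why the result is so much deeper than any single donor model.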