grimjim committed on
Commit
d56c380
1 Parent(s): 061160f

Update README.md

Added link to GGUF quants

Files changed (1)
  1. README.md +48 -45
README.md CHANGED
@@ -1,45 +1,48 @@
- ---
- base_model:
- - grimjim/kukulemon-32K-7B
- - grimjim/rogue-enchantress-32k-7B
- library_name: transformers
- tags:
- - mergekit
- - merge
- license: cc-by-nc-4.0
- pipeline_tag: text-generation
- ---
- # kukulemon-v3-soul_mix-32k-7B
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- We explore merger at extremely low weight as an alternative to fine-tuning. The additional model was applied at a weight of 10e-5, which was selected to be comparable to a few epochs of training. The low weight also amounts to the additional model being flattened, though technically not sparsified.
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method using [grimjim/kukulemon-32K-7B](https://huggingface.co/grimjim/kukulemon-32K-7B) as a base.
-
- ### Models Merged
-
- The following model was included in the merge:
- * [grimjim/rogue-enchantress-32k-7B](https://huggingface.co/grimjim/rogue-enchantress-32k-7B)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: grimjim/kukulemon-32K-7B
- dtype: bfloat16
- merge_method: task_arithmetic
- slices:
- - sources:
-   - layer_range: [0, 32]
-     model: grimjim/kukulemon-32K-7B
-   - layer_range: [0, 32]
-     model: grimjim/rogue-enchantress-32k-7B
-     parameters:
-       weight: 10e-5
-
- ```
 
 
 
 
+ ---
+ base_model:
+ - grimjim/kukulemon-32K-7B
+ - grimjim/rogue-enchantress-32k-7B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: cc-by-nc-4.0
+ pipeline_tag: text-generation
+ ---
+ # kukulemon-v3-soul_mix-32k-7B
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ We explore merger at extremely low weight as an alternative to fine-tuning. The additional model was applied at a weight of 10e-5, which was selected to be comparable to a few epochs of training. The low weight also amounts to the additional model being flattened, though technically not sparsified.
+
+ - [Full weights](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B)
+ - [GGUF quants](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B-GGUF)
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method using [grimjim/kukulemon-32K-7B](https://huggingface.co/grimjim/kukulemon-32K-7B) as a base.
+
+ ### Models Merged
+
+ The following model was included in the merge:
+ * [grimjim/rogue-enchantress-32k-7B](https://huggingface.co/grimjim/rogue-enchantress-32k-7B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: grimjim/kukulemon-32K-7B
+ dtype: bfloat16
+ merge_method: task_arithmetic
+ slices:
+ - sources:
+   - layer_range: [0, 32]
+     model: grimjim/kukulemon-32K-7B
+   - layer_range: [0, 32]
+     model: grimjim/rogue-enchantress-32k-7B
+     parameters:
+       weight: 10e-5
+
+ ```
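
To make the effect of the `weight: 10e-5` setting concrete, here is a minimal sketch of what a task arithmetic merge does to each parameter, assuming the basic formulation `merged = base + weight * (donor - base)`. It is not mergekit's implementation: the function name `task_arithmetic_merge`, the dict-of-lists stand-in for tensors, and the toy values are all hypothetical, and any normalization or dtype handling mergekit may apply is ignored.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's state dict: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def task_arithmetic_merge(
    base: StateDict, donors: List[StateDict], weights: List[float]
) -> StateDict:
    """Add each donor's weighted task vector (donor - base) onto the base."""
    merged: StateDict = {}
    for name, base_vals in base.items():
        out = list(base_vals)  # start from a copy of the base parameters
        for donor, weight in zip(donors, weights):
            for i, (b, d) in enumerate(zip(base_vals, donor[name])):
                out[i] += weight * (d - b)  # weighted task-vector contribution
        merged[name] = out
    return merged


# Toy values, not taken from any real model.
base = {"layer.0": [1.0, -2.0, 0.5]}
donor = {"layer.0": [2.0, -1.0, 0.0]}

# 10e-5 is the same number as 1e-4.
merged = task_arithmetic_merge(base, [donor], [10e-5])
print(merged)
```

Under this formulation, each merged parameter moves only one ten-thousandth of the way from the base model toward the donor, which is the sense in which the contribution is very small but not sparsified: every parameter still shifts slightly.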