TheDrummer committed on
Commit
cafb900
1 Parent(s): b653a1f

Update README.md

Files changed (1)
  1. README.md +85 -18
README.md CHANGED
@@ -1,28 +1,17 @@
  ---
- base_model: []
- library_name: transformers
- tags:
- - mergekit
- - merge
-
  ---
- # 100b
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method

- This model was merged using the passthrough merge method.

- ### Models Merged

- The following models were included in the merge:
- * ./largestral

- ### Configuration

- The following YAML configuration was used to produce this model:

  ```yaml
  slices:
@@ -35,3 +24,81 @@ slices:
  merge_method: passthrough
  dtype: bfloat16
  ```
  ---
+ license: other
  ---

+ # Lazarus 2407 100B 🌖

+ > To the brave men and women who gave their lives so we could begin again.

+ ## A pruned version of Mistral Large 2407 123B

+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/99LemcIuW0c2qWac-qDg8.png)

+ - Theory: https://arxiv.org/abs/2403.17887
+ - Practice: https://github.com/arcee-ai/PruneMe

  ```yaml
  slices:
@@ -35,3 +24,81 @@ slices:
  merge_method: passthrough
  dtype: bfloat16
  ```
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/gNsvay6NpaVKazg-nYSoL.png)
+
+ ```
+ block_start,block_end,average_distance
+ 1,17,0.42018115234375
+ 2,18,0.40477392578125
+ 3,19,0.3955498046875
+ 4,20,0.39202392578125
+ 5,21,0.38614453125
+ 6,22,0.380365234375
+ 7,23,0.37550830078125
+ 8,24,0.36995361328125
+ 9,25,0.36802099609375
+ 10,26,0.36337158203125
+ 11,27,0.3570400390625
+ 12,28,0.35895654296875
+ 13,29,0.360716796875
+ 14,30,0.36415234375
+ 15,31,0.3656328125
+ 16,32,0.3657451171875
+ 17,33,0.36321337890625
+ 18,34,0.36210791015625
+ 19,35,0.3613828125
+ 20,36,0.36101806640625
+ 21,37,0.36395361328125
+ 22,38,0.36294921875
+ 23,39,0.36490673828125
+ 24,40,0.36392578125
+ 25,41,0.3616650390625
+ 26,42,0.360994140625
+ 27,43,0.357458984375
+ 28,44,0.35728759765625
+ 29,45,0.35514013671875
+ 30,46,0.35244140625
+ 31,47,0.34866162109375
+ 32,48,0.3489375
+ 33,49,0.3480888671875
+ 34,50,0.34322607421875
+ 35,51,0.3366806640625
+ 36,52,0.331208984375
+ 37,53,0.31834423828125
+ 38,54,0.30483203125
+ 39,55,0.29027587890625
+ 40,56,0.2789296875
+ 41,57,0.263868408203125
+ 42,58,0.25043017578125
+ 43,59,0.23816162109375
+ 44,60,0.223565185546875
+ 45,61,0.216619140625
+ 46,62,0.212794189453125
+ 47,63,0.205733642578125
+ 48,64,0.200558837890625
+ 49,65,0.196917236328125
+ 50,66,0.1951201171875
+ 51,67,0.192659423828125
+ 52,68,0.191610595703125
+ 53,69,0.1910341796875
+ 54,70,0.190966064453125
+ 55,71,0.1920712890625
+ 56,72,0.1935439453125
+ 57,73,0.1973837890625
+ 58,74,0.200970703125
+ 59,75,0.205339111328125
+ 60,76,0.216475341796875
+ 61,77,0.218605224609375
+ 62,78,0.22363916015625
+ 63,79,0.229986083984375
+ 64,80,0.23776171875
+ 65,81,0.24455322265625
+ 66,82,0.2542783203125
+ 67,83,0.2635498046875
+ 68,84,0.27629443359375
+ 69,85,0.29324267578125
+ 70,86,0.32201220703125
+ 71,87,0.36908154296875
+ 72,88,0.4007587890625
+ ```
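
The table above lists one candidate 16-layer block per row. A minimal sketch of how such a table might be read, assuming the block with the lowest `average_distance` is the most redundant one and therefore the best pruning candidate. The README does not say which block was actually removed, and the helper name `most_redundant_block` and the five-row `SAMPLE` excerpt are illustrative only:

```python
import csv
import io

# A few rows copied verbatim from the table above (illustrative excerpt;
# in practice the full CSV would be read from disk).
SAMPLE = """block_start,block_end,average_distance
52,68,0.191610595703125
53,69,0.1910341796875
54,70,0.190966064453125
55,71,0.1920712890625
56,72,0.1935439453125
"""


def most_redundant_block(csv_text):
    """Return (block_start, block_end, average_distance) for the row
    with the smallest average distance (hypothetical helper)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    best = min(rows, key=lambda r: float(r["average_distance"]))
    return (
        int(best["block_start"]),
        int(best["block_end"]),
        float(best["average_distance"]),
    )


start, end, dist = most_redundant_block(SAMPLE)
print(f"lowest-distance block: {start}-{end} (average distance {dist})")
```

Run on the full table, the same minimum is found at the `54,70` row, since every other row has a larger distance. How those block indices map onto layer ranges in the `slices` configuration depends on the indexing convention of the tool that produced the table, which this README does not spell out.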