TheDrummer
committed on
Commit
•
cafb900
1
Parent(s):
b653a1f
Update README.md
README.md
CHANGED
@@ -1,28 +1,17 @@
|
|
1 |
---
|
2 |
-
|
3 |
-
library_name: transformers
|
4 |
-
tags:
|
5 |
-
- mergekit
|
6 |
-
- merge
|
7 |
-
|
8 |
---
|
9 |
-
# 100b
|
10 |
-
|
11 |
-
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
|
12 |
-
|
13 |
-
## Merge Details
|
14 |
-
### Merge Method
|
15 |
|
16 |
-
|
17 |
|
18 |
-
|
19 |
|
20 |
-
|
21 |
-
* ./largestral
|
22 |
|
23 |
-
|
24 |
|
25 |
-
|
|
|
26 |
|
27 |
```yaml
|
28 |
slices:
|
@@ -35,3 +24,81 @@ slices:
|
|
35 |
merge_method: passthrough
|
36 |
dtype: bfloat16
|
37 |
```
1 |
---
|
2 |
+
license: other
|
3 |
---
|
4 |
|
5 |
+
# Lazarus 2407 100B 🌖
|
6 |
|
7 |
+
> To the brave men and women who gave their lives so we could begin again.
|
8 |
|
9 |
+
## A pruned version of Mistral Large 2407 123B
|
|
|
10 |
|
11 |
+
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/99LemcIuW0c2qWac-qDg8.png)
|
12 |
|
13 |
+
- Theory: https://arxiv.org/abs/2403.17887
|
14 |
+
- Practice: https://github.com/arcee-ai/PruneMe
|
15 |
|
16 |
```yaml
|
17 |
slices:
|
|
|
24 |
merge_method: passthrough
|
25 |
dtype: bfloat16
|
26 |
```
|
27 |
+
|
28 |
+
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65f2fd1c25b848bd061b5c2e/gNsvay6NpaVKazg-nYSoL.png)
|
29 |
+
|
30 |
+
```
|
31 |
+
block_start,block_end,average_distance
|
32 |
+
1,17,0.42018115234375
|
33 |
+
2,18,0.40477392578125
|
34 |
+
3,19,0.3955498046875
|
35 |
+
4,20,0.39202392578125
|
36 |
+
5,21,0.38614453125
|
37 |
+
6,22,0.380365234375
|
38 |
+
7,23,0.37550830078125
|
39 |
+
8,24,0.36995361328125
|
40 |
+
9,25,0.36802099609375
|
41 |
+
10,26,0.36337158203125
|
42 |
+
11,27,0.3570400390625
|
43 |
+
12,28,0.35895654296875
|
44 |
+
13,29,0.360716796875
|
45 |
+
14,30,0.36415234375
|
46 |
+
15,31,0.3656328125
|
47 |
+
16,32,0.3657451171875
|
48 |
+
17,33,0.36321337890625
|
49 |
+
18,34,0.36210791015625
|
50 |
+
19,35,0.3613828125
|
51 |
+
20,36,0.36101806640625
|
52 |
+
21,37,0.36395361328125
|
53 |
+
22,38,0.36294921875
|
54 |
+
23,39,0.36490673828125
|
55 |
+
24,40,0.36392578125
|
56 |
+
25,41,0.3616650390625
|
57 |
+
26,42,0.360994140625
|
58 |
+
27,43,0.357458984375
|
59 |
+
28,44,0.35728759765625
|
60 |
+
29,45,0.35514013671875
|
61 |
+
30,46,0.35244140625
|
62 |
+
31,47,0.34866162109375
|
63 |
+
32,48,0.3489375
|
64 |
+
33,49,0.3480888671875
|
65 |
+
34,50,0.34322607421875
|
66 |
+
35,51,0.3366806640625
|
67 |
+
36,52,0.331208984375
|
68 |
+
37,53,0.31834423828125
|
69 |
+
38,54,0.30483203125
|
70 |
+
39,55,0.29027587890625
|
71 |
+
40,56,0.2789296875
|
72 |
+
41,57,0.263868408203125
|
73 |
+
42,58,0.25043017578125
|
74 |
+
43,59,0.23816162109375
|
75 |
+
44,60,0.223565185546875
|
76 |
+
45,61,0.216619140625
|
77 |
+
46,62,0.212794189453125
|
78 |
+
47,63,0.205733642578125
|
79 |
+
48,64,0.200558837890625
|
80 |
+
49,65,0.196917236328125
|
81 |
+
50,66,0.1951201171875
|
82 |
+
51,67,0.192659423828125
|
83 |
+
52,68,0.191610595703125
|
84 |
+
53,69,0.1910341796875
|
85 |
+
54,70,0.190966064453125
|
86 |
+
55,71,0.1920712890625
|
87 |
+
56,72,0.1935439453125
|
88 |
+
57,73,0.1973837890625
|
89 |
+
58,74,0.200970703125
|
90 |
+
59,75,0.205339111328125
|
91 |
+
60,76,0.216475341796875
|
92 |
+
61,77,0.218605224609375
|
93 |
+
62,78,0.22363916015625
|
94 |
+
63,79,0.229986083984375
|
95 |
+
64,80,0.23776171875
|
96 |
+
65,81,0.24455322265625
|
97 |
+
66,82,0.2542783203125
|
98 |
+
67,83,0.2635498046875
|
99 |
+
68,84,0.27629443359375
|
100 |
+
69,85,0.29324267578125
|
101 |
+
70,86,0.32201220703125
|
102 |
+
71,87,0.36908154296875
|
103 |
+
72,88,0.4007587890625
|
104 |
+
```
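
The table above lists, for each 16-layer block, an average distance between the block's input and output representations. Under the approach described in the linked paper and in PruneMe, the block with the smallest distance is usually treated as the most redundant and therefore the first candidate for removal. A minimal sketch of that selection step follows. The function name `most_redundant_block` and the `SAMPLE` constant are hypothetical, and only a handful of rows are copied from the table to keep the example short.

```python
import csv
import io

# A few rows copied verbatim from the table above (the full table has 72 rows).
SAMPLE = """block_start,block_end,average_distance
52,68,0.191610595703125
53,69,0.1910341796875
54,70,0.190966064453125
55,71,0.1920712890625
56,72,0.1935439453125
"""


def most_redundant_block(csv_text):
    """Return (block_start, block_end, average_distance) for the row
    with the smallest average distance.

    Hypothetical helper for illustration; it is not part of PruneMe.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    best = min(rows, key=lambda row: float(row["average_distance"]))
    return (
        int(best["block_start"]),
        int(best["block_end"]),
        float(best["average_distance"]),
    )


if __name__ == "__main__":
    start, end, distance = most_redundant_block(SAMPLE)
    print(f"lowest-distance block: {start}-{end} (distance {distance})")
```

On the full table the minimum is also the `54,70` row. Whether that is exactly the block dropped in the `slices:` config is not visible in this diff, since the slice definitions are collapsed.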
|