Update README.md
README.md CHANGED

@@ -27,11 +27,42 @@ datasets:

language:
- en
library_name: transformers
inference: false
tags:
- code
- art
---

# NinjaMouse-3B-40L-danube-exl2

Original model: [NinjaMouse-3B-40L-danube](https://huggingface.co/trollek/NinjaMouse-3B-40L-danube)
Model creator: [trollek](https://huggingface.co/trollek)

## Quants

[4bpw h6](https://huggingface.co/cgus/NinjaMouse-3B-40L-danube-exl2/tree/main)
[4.25bpw h6](https://huggingface.co/cgus/NinjaMouse-3B-40L-danube-exl2/tree/4.25bpw-h6)
[4.65bpw h6](https://huggingface.co/cgus/NinjaMouse-3B-40L-danube-exl2/tree/4.65bpw-h6)
[5bpw h6](https://huggingface.co/cgus/NinjaMouse-3B-40L-danube-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/NinjaMouse-3B-40L-danube-exl2/tree/6bpw-h6)
[8bpw h8](https://huggingface.co/cgus/NinjaMouse-3B-40L-danube-exl2/tree/8bpw-h8)
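
A specific quant can be fetched with `huggingface_hub`; here is a minimal sketch, where `revision` is any branch name from the list above and the local directory name is just an example:

```python
from huggingface_hub import snapshot_download

# Download one quant branch of this repo; "revision" selects the branch.
snapshot_download(
    repo_id="cgus/NinjaMouse-3B-40L-danube-exl2",
    revision="4.25bpw-h6",  # or "main" for 4bpw h6, "8bpw-h8", etc.
    local_dir="NinjaMouse-3B-40L-danube-exl2",  # example path
)
```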

## Quantization notes

Made with exllamav2 0.0.15 using its default calibration dataset. I'm very unsure about this model: for me it breaks down past 3000 tokens of context, at about 3500 or so, both with these quants and with the creator's GGUF files. At first I thought I had some quantization issues, but it's probably just the model itself.
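For reference, exllamav2 quantizes models with its bundled `convert.py` script; a call along the lines of `python convert.py -i <fp16_model_dir> -o <work_dir> -cf <output_dir> -b 4.25 -hb 6` should correspond to one of these quants, with the built-in calibration data used when no `-c` dataset is supplied (flag names per exllamav2's convert script around this version; paths are placeholders).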

## How to run

This quantization method runs on the GPU and requires the ExLlamaV2 loader, which is available in the following applications (a direct Python sketch follows the list):

[Text Generation Webui](https://github.com/oobabooga/text-generation-webui)

[KoboldAI](https://github.com/henk717/KoboldAI)

[ExUI](https://github.com/turboderp/exui)
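
To load the quant directly from Python instead, here is a minimal sketch based on the exllamav2 library's example inference API; the model path, prompt, and sampling settings are illustrative:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Point the config at a downloaded quant directory (example path).
config = ExLlamaV2Config()
config.model_dir = "NinjaMouse-3B-40L-danube-exl2"
config.prepare()

# Load the model across available GPU memory with a lazy cache.
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

# Keep prompt + output well under ~3000 tokens, per the notes above.
print(generator.generate_simple("Write a short poem about a ninja mouse:", settings, num_tokens=200))
```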

# Original model card

This is [NinjaMouse](https://huggingface.co/trollek/NinjaMouse-2.4B-32L-danube) extended even further. Instead of Cosmopedia I used different coding datasets.

I have learned a lot during this process, and if you have a GPU capable of training your own you should try it. I made some mistakes, like using pure_bf16 at one point, among other things, but the second version will slap the leaderboard for its weight class.