Update README.md
README.md
CHANGED
@@ -1,7 +1,7 @@
 ---
 license: other
 ---
-5 bit quantization of airoboros 70b 1.4.1, using exllama2.
+5 bit quantization of airoboros 70b 1.4.1 (https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1), using exllama2.
 
 On 2x4090, 3072 ctx seems to work fine with 21.5,22.5 gpu_split and max_attention_size = 1024 ** 2 instead of 2048 ** 2.
 
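For readers unfamiliar with the settings in the README's last line, here is a minimal Python sketch of the arithmetic behind them. It does not call exllama2; `parse_gpu_split` and `max_chunk_len` are hypothetical helpers written for this note. The chunking rule (query length times context length kept at or below `max_attention_size`) is an assumption about how that limit is applied, not something the README states.

```python
# Hypothetical helpers for illustration; only the numbers come from the README.

def parse_gpu_split(spec: str) -> list[float]:
    """Turn a comma-separated gpu_split string into per-GPU budgets (GB)."""
    return [float(part) for part in spec.split(",") if part.strip()]


def max_chunk_len(max_attention_size: int, ctx_len: int) -> int:
    """Largest query chunk such that chunk * ctx_len <= max_attention_size.

    Assumption: the limit bounds the attention matrix size (q_len * k_len).
    """
    return max(1, max_attention_size // ctx_len)


split = parse_gpu_split("21.5,22.5")   # README setting for 2x4090
reduced = 1024 ** 2                    # value the README recommends
baseline = 2048 ** 2                   # value the README compares against

print("per-GPU budget (GB):", split, "total:", sum(split))
print("attention size, reduced vs baseline:", reduced, baseline)
print("reduction factor:", baseline // reduced)
print("chunk length at 3072 ctx:",
      max_chunk_len(reduced, 3072), "vs", max_chunk_len(baseline, 3072))
```

Under that assumption, the smaller limit cuts the largest attention matrix by a factor of four. That is one plausible reason it leaves enough headroom for a 3072-token context on two 24 GB cards.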