luigi86 committed
Commit 346c6c7
1 Parent(s): ab7cad1

Update README.md

Files changed (1): README.md (+6 -2)
README.md CHANGED
@@ -16,11 +16,13 @@ Quantized using the [cleaned PIPPA](https://huggingface.co/datasets/royallab/PIP

  - [2.25bpw6h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/2.25bpw6h) (tested and working on a single RTX 3090 24GiB at 16k context length)

- - [2.4bpw6h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/2.4bpw6h) (may not load on 24GiB VRAM machines -- untested!)
+ - [2.4bpw6h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/2.4bpw6h) (may not load on 24GiB VRAM machines!)

  - [3.0bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/3.0bpw8h)

- - [4.0bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/4.0bpw8h) (tested and working on two 3090s with Q4 cache at 32k context)
+ - [4.0bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/4.0bpw8h) (tested and working on two 3090s at 32k context/cache)
+
+ - [4.4bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/4.4bpw8h) (tested and working on two 3090s at 32k context, 64k Q4 cache (for CFG or parallelism) with tabbyAPI)

  - [4.5bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/4.5bpw8h)

@@ -29,6 +31,8 @@ Quantized using the [cleaned PIPPA](https://huggingface.co/datasets/royallab/PIP
  - [8.0bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/8.0bpw8h)


+ All tests performed on a headless Linux instance with no active desktop environment to maximize VRAM.
+
  Other quants available on request, feel free to ask!

 
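For reference, a rough sketch of loading one of these quants with the exllamav2 Python API and a quantized (Q4) KV cache, along the lines of the single-3090, 16k-context note above. Class names follow exllamav2's published examples, but exact signatures vary between versions, and the tests above were run through tabbyAPI, so treat this as an outline rather than the tested setup:

```python
# Outline: load an exl2 quant with a Q4 KV cache via exllamav2.
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Cache_Q4,
    ExLlamaV2Config,
    ExLlamaV2Tokenizer,
)

config = ExLlamaV2Config()
config.model_dir = "magnum-72b-v1-2.25bpw6h"  # path from the download sketch above
config.prepare()
config.max_seq_len = 16384  # 16k context, per the 2.25bpw6h note in the diff

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)  # Q4 cache trims KV-cache VRAM
model.load_autosplit(cache)                  # auto-splits across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
```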