luigi86/magnum-72b-v1-exl2-rpcal


luigi86 committed on Jul 9

Commit ab7cad1 • 1 Parent(s): 49fb245

Update README.md

Files changed (1)

README.md +2 -0


@@ -14,6 +14,8 @@ tags:
 Quantized using the [cleaned PIPPA](https://huggingface.co/datasets/royallab/PIPPA-cleaned) roleplay dataset.
+- [2.25bpw6h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/2.25bpw6h) (tested and working on a single RTX 3090 24GiB at 16k context length)
 - [2.4bpw6h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/2.4bpw6h) (may not load on 24GiB VRAM machines -- untested!)
 - [3.0bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/3.0bpw8h)
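
The VRAM notes in the list above can be sanity-checked with rough arithmetic. The sketch below is only an estimate under stated assumptions: it takes about 72e9 parameters from the model name (the exact count is not given here), treats the bpw figure as the average bits per weight, and ignores the higher-precision head layer, the KV cache, and activation buffers. The helper `weight_gib` is hypothetical and not part of any library.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 2**30                    # bytes -> GiB


# Assumption: ~72e9 parameters, inferred from the "72b" in the repo name.
N_PARAMS = 72e9

for bpw in (2.25, 2.4, 3.0):
    print(f"{bpw}bpw: ~{weight_gib(N_PARAMS, bpw):.1f} GiB of weights")
```

Under these assumptions the 2.25bpw weights leave a few GiB of a 24GiB card for context, the 2.4bpw weights leave noticeably less, and the 3.0bpw weights alone exceed 24GiB. Actual usage depends on the loader, cache settings, and context length.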