luigi86/magnum-72b-v1-exl2-rpcal


luigi86 committed on Jul 9

Commit 871fcf9 • 1 Parent(s): d31f196

Update README.md

Files changed (1)

README.md +1 -1


@@ -14,7 +14,7 @@ tags:
 Quantized using the [cleaned PIPPA](https://huggingface.co/datasets/royallab/PIPPA-cleaned) roleplay dataset. Uploading as I didn't see anyone else do this one yet.
-[4.0bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/4.0bpw8h)
+[4.0bpw8h quants](https://huggingface.co/luigi86/magnum-72b-v1-exl2-rpcal/tree/4.0bpw8h) (tested and working on two 3090s with Q4 cache at 32k context)
 See [original model](https://huggingface.co/alpindale/magnum-72b-v1) for further details.