Information
This is an Exl2-quantized version of MN-LooseCannon-12B-v1.
Please refer to the original creator for more information.
Calibration dataset: Exl2 default
Branches:
- main: Measurement files
- 4bpw: 4 bits per weight
- 5bpw: 5 bits per weight
- 6bpw: 6 bits per weight
Notes
- 6bpw is recommended for the best ratio of quality to VRAM usage (assuming you have enough VRAM).
- Quants above 6bpw will not be created because they offer no meaningful improvement. If you really want them, ask someone else or make them yourself.
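As a rough guide to which branch fits your GPU, the size of the weights can be estimated from the bits per weight. The sketch below assumes a nominal 12 billion parameters (read off the "12B" in the model name, not an exact count) and treats the bpw figure as a flat average. It ignores the KV cache, activations, and framework overhead, so real VRAM usage will be higher.

```python
# Rough weight-size estimate per branch; an approximation, not a measurement.
# ASSUMPTION: a nominal 12e9 parameters, read off the "12B" in the model name.
NOMINAL_PARAMS = 12_000_000_000


def weight_size_gib(bits_per_weight: float, params: int = NOMINAL_PARAMS) -> float:
    """Return the approximate size of the quantized weights in GiB."""
    total_bytes = params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1024**3


if __name__ == "__main__":
    for bpw in (4, 5, 6):
        print(f"{bpw}bpw: ~{weight_size_gib(bpw):.1f} GiB of weights")
```

Leave headroom on top of these numbers for the context cache, which grows with the context length you configure.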
Download
With async-hf-downloader: a lightweight, asynchronous Hugging Face downloader created by me
./async-hf-downloader royallab/MN-LooseCannon-12B-v1-exl2 -r 6bpw -p MN-LooseCannon-12B-v1-exl2-6bpw
With HuggingFace hub (pip install huggingface_hub)
huggingface-cli download royallab/MN-LooseCannon-12B-v1-exl2 --revision 6bpw --local-dir MN-LooseCannon-12B-v1-exl2-6bpw
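The same download can also be done from Python. This is a minimal sketch, assuming huggingface_hub is installed. snapshot_download and its repo_id, revision, and local_dir parameters come from that library; the helper names quant_dir and download_quant are made up for this example.

```python
REPO_ID = "royallab/MN-LooseCannon-12B-v1-exl2"


def quant_dir(branch: str) -> str:
    """Build the local folder name used in the CLI examples (hypothetical helper)."""
    return f"{REPO_ID.split('/')[-1]}-{branch}"


def download_quant(branch: str = "6bpw") -> str:
    """Download one branch of the repo and return its local path (hypothetical helper)."""
    # Imported here so this file still loads when huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch,  # branch name from the list above: 4bpw, 5bpw, or 6bpw
        local_dir=quant_dir(branch),
    )
```

Calling download_quant("6bpw") should place the files in MN-LooseCannon-12B-v1-exl2-6bpw, matching the folder name used in the commands above.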
Run in TabbyAPI
TabbyAPI is a pure exllamav2 FastAPI server developed by us. You can find TabbyAPI's source code here: https://github.com/theroyallab/TabbyAPI
Inside TabbyAPI's config.yml, set model_name to MN-LooseCannon-12B-v1-exl2-6bpw
- You can also use the argument --model_name MN-LooseCannon-12B-v1-exl2-6bpw on startup, or you can use the /v1/model/load endpoint
Launch TabbyAPI inside your Python env by running ./start.bat or ./start.sh
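Once the server is running, the /v1/model/load endpoint can be called over HTTP instead of editing config.yml. The sketch below only builds the request with the Python standard library. The base URL (http://127.0.0.1:5000), the x-admin-key header, and the JSON field "name" are assumptions about a default TabbyAPI setup, and build_load_request is a hypothetical helper, so check these against TabbyAPI's own documentation before relying on them.

```python
import json
import urllib.request

MODEL_NAME = "MN-LooseCannon-12B-v1-exl2-6bpw"


def build_load_request(
    admin_key: str, base_url: str = "http://127.0.0.1:5000"
) -> urllib.request.Request:
    """Build (but do not send) a model-load request (hypothetical helper)."""
    # ASSUMPTION: the endpoint takes a JSON body with a "name" field and
    # authenticates with an "x-admin-key" header.
    body = json.dumps({"name": MODEL_NAME}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/model/load",
        data=body,
        headers={"Content-Type": "application/json", "x-admin-key": admin_key},
        method="POST",
    )
```

To send it, pass the result to urllib.request.urlopen, using the admin key from your own TabbyAPI install.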
Donate?
All my infrastructure and cloud expenses are paid out of pocket. If you'd like to donate, you can do so here: https://ko-fi.com/kingbri