mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-4bit_g64-HQQ
Text Generation
4-bit and 2-bit Mixtral models quantized using https://github.com/mobiusml/hqq
Note: If you are considering a 2-bit instruct model, use this one.
Note: If you are considering a 2-bit base model, use this one.
Note: If you are considering a 2-bit base model but are GPU-poor, this is a good option. It requires 13GB of RAM, but it will be slower.
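As a rough guide to what a name such as `4bit_g64` implies for memory, the sketch below estimates the storage for group-wise quantized weights. It is only back-of-the-envelope arithmetic, not the HQQ implementation: the function names are hypothetical, the parameter count of about 46.7 billion for Mixtral 8x7B is an approximation, and the 32 bits of metadata per group (a 16-bit scale plus a 16-bit zero-point) is an assumption. HQQ may store or further compress its metadata differently, and layers left unquantized are ignored here.

```python
# Back-of-the-envelope estimate of weight storage under group-wise quantization.
# All names and constants below are illustrative assumptions, not HQQ's API.

def bits_per_weight(nbits: int, group_size: int, meta_bits: int = 32) -> float:
    """Average bits stored per weight.

    Each group of `group_size` weights shares `meta_bits` of metadata
    (assumed here: one 16-bit scale and one 16-bit zero-point).
    """
    return nbits + meta_bits / group_size


def weight_gigabytes(n_params: float, nbits: int, group_size: int,
                     meta_bits: int = 32) -> float:
    """Approximate size in GB (1e9 bytes) of the quantized weights."""
    return n_params * bits_per_weight(nbits, group_size, meta_bits) / 8 / 1e9


# Assumption: approximate total parameter count of Mixtral 8x7B.
N_PARAMS = 46.7e9

if __name__ == "__main__":
    fp16_gb = N_PARAMS * 16 / 8 / 1e9          # unquantized 16-bit baseline
    q4_g64_gb = weight_gigabytes(N_PARAMS, nbits=4, group_size=64)
    print(f"fp16 baseline : {fp16_gb:.1f} GB")
    print(f"4-bit, g64    : {q4_g64_gb:.1f} GB")
```

Smaller groups improve accuracy but raise the metadata overhead per weight, which is why the group size appears in the model name alongside the bit width.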