@ybelkada on Hugging Face: "Check out quantized weights from ISTA-DAS Lab directly in their organisation…"

ybelkada

posted an update Mar 7, 2024

Check out the quantized weights from ISTA-DAS Lab directly on their organisation page: https://huggingface.co/ISTA-DASLab ! It hosts the official weights for AQLM (2-bit quantization) & QMoE (sub-1-bit MoE quantization).

Read more about these techniques below:

AQLM paper: Extreme Compression of Large Language Models via Additive Quantization (2401.06118)
QMoE paper: QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models (2310.16795)
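
To get a feel for what 2-bit or sub-1-bit weights mean in practice, here is a rough back-of-the-envelope sketch. It only multiplies parameter count by bits per weight. The function name `weight_memory_gib` and the 7B parameter count are illustrative assumptions, and the estimate ignores codebooks, scales, and any layers kept in higher precision, so real checkpoints will be somewhat larger.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes, then bytes -> GiB.

    Ignores quantization metadata (codebooks, scales) and unquantized layers.
    """
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical 7B-parameter model, used only for illustration.
n_params = 7e9
fp16_gib = weight_memory_gib(n_params, 16)
two_bit_gib = weight_memory_gib(n_params, 2)

print(f"16-bit weights: {fp16_gib:.2f} GiB")
print(f" 2-bit weights: {two_bit_gib:.2f} GiB")
print(f"reduction: {fp16_gib / two_bit_gib:.0f}x")
```

The same helper works for fractional bit widths, which is how a sub-1-bit scheme would be plugged in.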

Some useful links below:

AQLM repo: https://github.com/Vahe1994/AQLM
How to use AQLM & transformers: https://huggingface.co/docs/transformers/quantization#aqlm
How to use AQLM & PEFT: https://huggingface.co/docs/peft/developer_guides/quantization#aqlm-quantizaion
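
For a quick start, here is a minimal sketch of loading one of the AQLM checkpoints through transformers. It assumes the `aqlm` package is installed (`pip install aqlm[gpu,cpu]`) along with `accelerate` for `device_map="auto"`, and it uses the Mixtral checkpoint ID as an example; check the organisation page for the exact model you want. The helper name `load_aqlm_model` is made up for this sketch.

```python
# Example checkpoint ID (an assumption for this sketch); browse
# https://huggingface.co/ISTA-DASLab for the available models.
MODEL_ID = "ISTA-DASLab/Mixtral-8x7b-AQLM-2Bit-1x16-hf"


def load_aqlm_model(model_id: str = MODEL_ID):
    """Load an AQLM-quantized causal LM and its tokenizer.

    Imports are done lazily so this file can be imported without
    transformers installed. Calling this downloads the checkpoint.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint config
        device_map="auto",    # let accelerate place the weights on available devices
    )
    return tokenizer, model


# Usage (needs a GPU with enough memory for the chosen checkpoint):
#   tokenizer, model = load_aqlm_model()
#   inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
#   output = model.generate(**inputs, max_new_tokens=20)
#   print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The quantization settings are read from the checkpoint's own config, so no extra quantization arguments are passed here.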

Great work from @BlackSamorez and team!

ybelkada Younes Belkada