- GGUF: perfect for inference on CPUs (and LM Studio)
- GPTQ/EXL2: fast inference on GPUs
- AWQ: super fast inference on GPUs with vLLM (https://github.com/vllm-project/vllm) — see the sketch after this list
- HQQ: extreme quantization with decent 2-bit and 3-bit models
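As a quick illustration of the AWQ + vLLM combination, here is a minimal sketch; the model ID is a placeholder (any AWQ-quantized checkpoint on the Hub would do) and the sampling settings are just example values, not a prescribed configuration.

```python
from vllm import LLM, SamplingParams

# Placeholder repo ID: substitute any AWQ-quantized model from the Hugging Face Hub
llm = LLM(model="your-username/Mistral-7B-awq", quantization="awq")

# Example sampling settings (adjust to taste)
params = SamplingParams(temperature=0.8, max_tokens=128)

outputs = llm.generate(["Explain quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```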
Once the model is converted, it is automatically uploaded to the Hugging Face Hub. To quantize a 7B model, GGUF only needs a T4 GPU, while the other methods require an A100 GPU.
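For reference, a rough sketch of what that upload step can look like with the `huggingface_hub` library is shown below; the repo ID and local folder are placeholders, not the exact code used here.

```python
from huggingface_hub import HfApi

api = HfApi()

# Placeholder repo ID and local path for the quantized model
repo_id = "your-username/Mistral-7B-GGUF"
api.create_repo(repo_id=repo_id, exist_ok=True)

# Push the folder containing the quantized weights to the Hub
api.upload_folder(folder_path="./quantized_model", repo_id=repo_id)
```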