Daniel (Unsloth) PRO

danielhanchen · https://unsloth.ai/

AI & ML interests

None yet

Recent Activity

updated a model about 2 hours ago

unsloth/Mistral-Large-3-675B-Base-2512

updated a model about 3 hours ago

unsloth/Mistral-Large-3-675B-Instruct-2512-Eagle

updated a model about 3 hours ago

unsloth/Mistral-Large-3-675B-Instruct-2512-NVFP4

upvoted a collection 4 days ago

Ministral 3

Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 28 items • Updated 3 days ago • 19

upvoted an article 17 days ago

Introducing Cogito v2.1

Article • 17 days ago • 17

upvoted a paper about 1 month ago

Efficient Long-context Language Model Training by Core Attention Disaggregation

Paper • 2510.18121 • Published Oct 20 • 120

upvoted a paper 6 months ago

Speechless: Speech Instruction Training Without Speech for Low Resource Languages

Paper • 2505.17417 • Published May 23 • 14

upvoted 2 collections 7 months ago

TorchAO Quantized Phi-4-mini-instruct

TorchAO quantized Phi-4-mini-instruct models from the PyTorch team, runnable on A100 and H100 GPUs through vLLM and on mobile devices through ExecuTorch • 4 items • Updated Sep 10 • 3

Unsloth Dynamic 2.0 Quants

New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 59 items • Updated 4 days ago • 257

upvoted an article 8 months ago

Comparing sub 50GB Llama 4 Scout quants (KLD/Top P)

Article • Apr 9 • 44

upvoted a collection 9 months ago

Qwen2.5-VL (All Versions)

All versions of Qwen2.5-VL including the new 32B version and 4-bit, 16-bit and more! • 16 items • Updated 4 days ago • 22

upvoted 2 collections 11 months ago

DeepSeek R1 (All Versions)

DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 4 days ago • 261

Phi-4 (All Versions)

Microsoft's Phi-4 models including Reasoning + Reasoning Plus & mini. Includes Dynamic 2.0 GGUF, 4-bit & 16-bit versions. Includes Unsloth's bug fixes • 20 items • Updated 4 days ago • 76

upvoted 2 collections 12 months ago

Unsloth 4-bit Dynamic Quants

Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while using <10% more VRAM than BnB 4-bit • 28 items • Updated 4 days ago • 91

Llama 3.3 (All Versions)

Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated 4 days ago • 37

upvoted a collection about 1 year ago

Qwen 2.5 Coder

Complete collection of Code-specific model series for Qwen2.5 in bnb 4bit, 16bit and GGUF formats. • 35 items • Updated 4 days ago • 35

upvoted an article about 1 year ago

Fixing Gradient Accumulation

Article • Oct 16, 2024 • 63

upvoted 3 collections about 1 year ago

4bit Instruct Models

18 items • Updated 4 days ago • 33

Load 4bit models 4x faster

Native bitsandbytes 4bit pre quantized models • 25 items • Updated 4 days ago • 59

Llama 3.2

Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated 4 days ago • 68

upvoted an article over 1 year ago

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Article • Jul 29, 2024 • 364

upvoted a collection over 1 year ago

Navarasa 2.0 Models

Collection of Navarasa 2.0 models finetuned with Gemma on 15 Indian languages • 5 items • Updated Mar 18, 2024 • 22

upvoted a collection almost 2 years ago

OpenCodeInterpreter

18 items • Updated Mar 3, 2024 • 84