πŸ“Œ Overview

A 4-bit AWQ-quantized version of google/medgemma-4b-it, optimized for efficient inference with the MLX library on Apple silicon. It is designed to handle long-context tasks (192k tokens) at reduced resource usage, retaining the core capabilities of medgemma-4b-it while enabling deployment on edge devices.
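For local inference, the quantized weights can be loaded with the mlx-lm package. A minimal sketch, assuming `pip install mlx-lm` and an Apple-silicon machine; the prompt text is purely illustrative:

```python
# Minimal sketch: load the 4-bit AWQ weights and run one chat turn with mlx-lm.
from mlx_lm import load, generate

model, tokenizer = load("Goraint/medgemma-4b-it-MLX-AWQ-4bit")

# MedGemma is instruction-tuned, so wrap the query in the chat template.
messages = [{"role": "user", "content": "Summarize the key symptoms of anemia."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```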

Model size: 0.8B params (Safetensors; tensor types: BF16 Β· U32)

Model tree for Goraint/medgemma-4b-it-MLX-AWQ-4bit

Base model: google/medgemma-4b-it (this model is a 4-bit AWQ quantization of it)