MADLAD-400 3B-MT — MLX (Apple Silicon)
Quantized MLX port of google/madlad400-3b-mt for on-device, many-to-many translation across 400+ languages on Apple Silicon. Apache 2.0.
Architecture
T5 v1.1 encoder-decoder with relative position bias:
- 32 encoder + 32 decoder layers
- d_model = 1024, d_kv = 128, num_heads = 16, d_ff = 8192
- Gated GeLU FFN (wi_0, wi_1, wo)
- RMSNorm pre-norm, no biases
- Relative position bias (32 buckets, max distance 128) in the first layer of each stack
- Separate lm_head (NOT tied to embeddings)
- SentencePiece vocabulary, 256,512 tokens (includes 400+ <2xx> target-language tokens)
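To make the relative position bias entry concrete, here is a minimal sketch of the bucketing scheme commonly used by T5-style models, with the 32 buckets and max distance 128 listed above. It assumes this port follows the standard T5 scheme; the function name and signature are illustrative and not part of this repository.

```python
import math


def relative_position_bucket(relative_position: int,
                             bidirectional: bool,
                             num_buckets: int = 32,
                             max_distance: int = 128) -> int:
    """Map a signed offset (key position - query position) to a bucket index.

    Sketch of the standard T5 scheme; assumed, not taken from this port's code.
    """
    bucket = 0
    if bidirectional:
        # Encoder: half the buckets cover keys after the query, half before.
        num_buckets //= 2
        if relative_position > 0:
            bucket += num_buckets
        distance = abs(relative_position)
    else:
        # Decoder self-attention: only past keys matter; future keys map to 0.
        distance = max(-relative_position, 0)

    max_exact = num_buckets // 2
    if distance < max_exact:
        # Small distances each get their own bucket.
        return bucket + distance

    # Larger distances share logarithmically sized buckets up to max_distance.
    log_ratio = math.log(distance / max_exact) / math.log(max_distance / max_exact)
    large = max_exact + int(log_ratio * (num_buckets - max_exact))
    return bucket + min(large, num_buckets - 1)
```

Each bucket index selects one learned bias per attention head, which is added to the attention logits before the softmax.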
Variants
| Variant | Size | Path |
|---|---|---|
| INT4 | ~1.6 GB | int4/model.safetensors |
| INT8 | ~3.1 GB | int8/model.safetensors |
Each variant includes config.json, tokenizer.json, tokenizer_config.json, special_tokens_map.json, and spiece.model.
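The variant sizes can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes roughly 2.9 billion quantized weights (a round figure for a 3B model, not a number from this card), groups of 64 weights, and one fp16 scale plus one fp16 bias stored per group. Small unquantized tensors are ignored, so treat the result as approximate.

```python
def estimated_size_bytes(num_weights: int,
                         bits: int,
                         group_size: int = 64,
                         overhead_bits_per_group: int = 32) -> float:
    """Estimate storage for group-wise quantized weights.

    overhead_bits_per_group assumes one fp16 scale and one fp16 bias per group.
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return num_weights * bits_per_weight / 8


# Assumed round figure for the quantized weight count; not from the model card.
ASSUMED_WEIGHTS = 2_900_000_000

for name, bits in (("INT4", 4), ("INT8", 8)):
    gigabytes = estimated_size_bytes(ASSUMED_WEIGHTS, bits) / 1e9
    print(f"{name}: ~{gigabytes:.1f} GB")
```

Under these assumptions the per-group overhead adds 0.5 bits per weight, which is why the INT4 file comes out slightly larger than half of the INT8 file.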
Usage
import MADLADTranslation
let translator = try await MADLADTranslator.fromPretrained(quantization: .int4)
let es = try translator.translate("Hello, how are you?", to: "es")
// → "Hola, ¿cómo estás?"
let zh = try translator.translate("Where is the library?", to: "zh")
// → "图书馆在哪里?"
Apart from the input text, the target language is the only required parameter; MADLAD infers the source language from the input itself. Specify the target as an ISO 639-1 code (or any of MADLAD's supported language tags); the tokenizer turns it into a leading <2{lang}> token.
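The effect of the leading target-language token can be illustrated with a few lines of Python. This is only a sketch of the prompt format: the helper name and the validation rule are hypothetical, and the toolkit performs this step internally.

```python
def with_target_tag(text: str, target_lang: str) -> str:
    """Prefix the input with a <2xx> target-language tag (hypothetical helper)."""
    # Minimal sanity check, assumed for illustration: a non-empty tag
    # without whitespace or angle brackets.
    if not target_lang or any(ch.isspace() or ch in "<>" for ch in target_lang):
        raise ValueError(f"invalid language tag: {target_lang!r}")
    return f"<2{target_lang}> {text.strip()}"


print(with_target_tag("Hello, how are you?", "es"))
# prints: <2es> Hello, how are you?
```

No source-language tag is added, which matches the behavior described above: the model works from the input text and the target tag alone.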
CLI:
audio translate "Good morning" --to fr
audio transcribe meeting.wav | audio translate --to es
Part of the soniqo speech toolkit for Apple Silicon.
Conversion
Quantized directly from google/madlad400-3b-mt using mx.quantize() (group_size=64). The duplicate encoder.embed_tokens.weight and decoder.embed_tokens.weight keys are dropped; both the encoder and the decoder reuse shared.weight directly. The linear projections (q/k/v/o, wi_0/wi_1/wo, lm_head) and the shared embedding table are quantized; RMSNorm scales and the relative-position-bias table stay in fp16.
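The two conversion steps can be sketched in plain Python. This is a simplified illustration of group-wise affine quantization (a scale and a bias per group of 64 weights) and of dropping the duplicate embedding keys. It is not the actual conversion script, all function names are hypothetical, and the real mx.quantize() additionally packs the integer codes into 32-bit words.

```python
def quantize_group(weights, bits=4):
    """Quantize one group to integer codes plus a (scale, bias) pair."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels or 1.0  # constant group: avoid division by zero
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Recover approximate weights from codes, scale, and bias."""
    return [c * scale + bias for c in codes]


def quantize_row(row, bits=4, group_size=64):
    """Split a row into fixed-size groups and quantize each independently."""
    if len(row) % group_size:
        raise ValueError("row length must be a multiple of group_size")
    return [quantize_group(row[i:i + group_size], bits)
            for i in range(0, len(row), group_size)]


def drop_duplicate_embeddings(weights: dict) -> dict:
    """Remove the per-stack embedding copies; shared.weight is kept."""
    duplicates = {"encoder.embed_tokens.weight", "decoder.embed_tokens.weight"}
    return {k: v for k, v in weights.items() if k not in duplicates}
```

With this scheme the reconstruction error of each weight is bounded by half the group's scale, so smaller groups trade a little extra storage for tighter error bounds.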
License
Apache 2.0 (inherited from the base model).
- Guide: soniqo.audio/guides/translate
- Docs: soniqo.audio
- GitHub: soniqo/speech-swift