MADLAD-400 3B-MT — MLX (Apple Silicon)

Quantized MLX port of google/madlad400-3b-mt for on-device, many-to-many translation across 400+ languages on Apple Silicon. Apache 2.0.

Architecture

T5 v1.1 encoder-decoder with relative position bias:

  • 32 encoder + 32 decoder layers
  • d_model = 2048, d_kv = 128, num_heads = 16, d_ff = 16384
  • Gated GeLU FFN (wi_0, wi_1, wo)
  • RMSNorm pre-norm, no biases
  • Relative position bias (32 buckets, max distance 128), stored in the first layer of each stack and reused by the later layers
  • Separate lm_head (NOT tied to embeddings)
  • SentencePiece vocabulary, 256,512 tokens (includes 400+ <2xx> target-language tokens)
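
The bucket settings listed above (32 buckets, max distance 128) match the standard T5 scheme, where small offsets each get their own bucket and larger offsets share logarithmically sized buckets. The following is a minimal sketch of that mapping, assuming this port uses the same bucketing as the reference T5 implementation. The function name relativePositionBucket is hypothetical and is not part of the MADLADTranslation API.

```swift
import Foundation

/// Sketch of T5-style relative position bucketing (assumed, not taken from this port).
/// `relativePosition` is the key position minus the query position.
func relativePositionBucket(
    _ relativePosition: Int,
    bidirectional: Bool,
    numBuckets: Int = 32,
    maxDistance: Int = 128
) -> Int {
    var buckets = numBuckets
    var offset = 0
    let distance: Int
    if bidirectional {
        // Encoder: half of the buckets cover keys to the right, half to the left.
        buckets /= 2
        if relativePosition > 0 { offset = buckets }
        distance = abs(relativePosition)
    } else {
        // Decoder self-attention: only keys at or before the query matter.
        distance = max(-relativePosition, 0)
    }
    // The first half of the remaining buckets maps one offset per bucket.
    let maxExact = buckets / 2
    if distance < maxExact { return offset + distance }
    // Larger offsets are spread logarithmically up to maxDistance, then clamped.
    let scaled = log(Double(distance) / Double(maxExact))
        / log(Double(maxDistance) / Double(maxExact))
        * Double(buckets - maxExact)
    return offset + min(maxExact + Int(scaled), buckets - 1)
}

print(relativePositionBucket(-3, bidirectional: true))    // 3
print(relativePositionBucket(1, bidirectional: true))     // 17
print(relativePositionBucket(-200, bidirectional: true))  // 15
```

In this scheme the encoder would use bidirectional: true and the decoder's self-attention bidirectional: false; the returned index selects a row of the learned bias table.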

Variants

Variant   Size      Path
INT4      ~1.6 GB   int4/model.safetensors
INT8      ~3.1 GB   int8/model.safetensors

Each variant includes config.json, tokenizer.json, tokenizer_config.json, special_tokens_map.json, and spiece.model.

Usage

import MADLADTranslation

let translator = try await MADLADTranslator.fromPretrained(quantization: .int4)
let es = try translator.translate("Hello, how are you?", to: "es")
// → "Hola, ¿cómo estás?"

let zh = try translator.translate("Where is the library?", to: "zh")
// → "图书馆在哪里?"

Only the target language needs to be specified: MADLAD takes no source-language tag and infers the source language from the input text. Pass the target as an ISO 639-1 code (or any other language tag MADLAD supports); the tokenizer turns it into a leading <2{lang}> token.
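
To make the tagging step concrete, here is a minimal sketch of how a target code could be turned into the tagged string the model sees. The helper taggedInput is hypothetical and only illustrates the <2{lang}> convention; the package's tokenizer may build the prefix differently, for example directly at the token-ID level.

```swift
import Foundation

/// Hypothetical helper (not part of MADLADTranslation): prepends the
/// MADLAD target-language tag to the source text.
func taggedInput(_ text: String, to languageCode: String) -> String {
    // Trim stray whitespace so "es " and "es" produce the same tag.
    let code = languageCode.trimmingCharacters(in: .whitespacesAndNewlines)
    precondition(!code.isEmpty, "a target language code is required")
    return "<2\(code)> \(text)"
}

print(taggedInput("Hello, how are you?", to: "es"))
// <2es> Hello, how are you?
```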

CLI:

audio translate "Good morning" --to fr
audio transcribe meeting.wav | audio translate --to es

Part of the soniqo speech toolkit for Apple Silicon.

Conversion

Quantized directly from google/madlad400-3b-mt with mx.quantize() (group_size=64). The duplicate encoder.embed_tokens.weight and decoder.embed_tokens.weight keys are dropped; both encoder and decoder reuse shared.weight directly. The linear projections (q/k/v/o, wi_0/wi_1/wo, lm_head) and the shared embedding table are quantized; RMSNorm scales and the relative-position-bias table stay in fp16.
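
mx.quantize() performs group-wise affine quantization: each group of group_size consecutive weights shares one scale and one bias. The sketch below illustrates that idea in plain Swift for a single group. It is not the MLX kernel and does not reproduce its bit packing. The type and function names are hypothetical, and the per-weight overhead estimate assumes one fp16 scale and one fp16 bias per group.

```swift
/// One quantized group: integer codes plus the shared scale and bias.
struct QuantizedGroup {
    let codes: [UInt8]
    let scale: Float
    let bias: Float
}

/// Illustrative affine quantization of one group (hypothetical helper).
func quantizeGroup(_ weights: [Float], bits: Int) -> QuantizedGroup {
    precondition(!weights.isEmpty && (1...8).contains(bits), "need weights and 1...8 bits")
    let lo = weights.min()!
    let hi = weights.max()!
    let levels = Float((1 << bits) - 1)
    // Map [lo, hi] onto the integer range 0...levels; guard constant groups.
    let scale = hi > lo ? (hi - lo) / levels : 1
    let codes = weights.map { w -> UInt8 in
        let q = ((w - lo) / scale).rounded()
        return UInt8(min(max(q, 0), levels))
    }
    return QuantizedGroup(codes: codes, scale: scale, bias: lo)
}

/// Reconstructs approximate weights from the codes.
func dequantizeGroup(_ group: QuantizedGroup) -> [Float] {
    group.codes.map { Float($0) * group.scale + group.bias }
}

/// Storage cost per weight, assuming an fp16 scale and an fp16 bias per group.
func effectiveBitsPerWeight(bits: Int, groupSize: Int, metadataBits: Int = 32) -> Double {
    Double(bits) + Double(metadataBits) / Double(groupSize)
}

print(effectiveBitsPerWeight(bits: 4, groupSize: 64))  // 4.5
print(effectiveBitsPerWeight(bits: 8, groupSize: 64))  // 8.5
```

Under these assumptions the reconstruction error for each weight is at most half a quantization step, which is why smaller groups (more scales and biases) trade file size for accuracy.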

License

Apache 2.0 (inherited from the base model).

