🇲🇦 Terjman-Supreme-v2.0 (3.3B) 🚀
Terjman-Supreme-v2.0 is an improved version of its first-generation predecessor from ATLASIA, built on the powerful Transformer architecture and fine-tuned for high-quality, accurate translations.
This version is still based on facebook/nllb-200-3.3B but has been trained on a larger and more refined dataset, leading to improved translation performance. The model achieves results on par with gpt-4o-2024-08-06 on TerjamaBench, an evaluation benchmark for English-Moroccan darija translation that places particular emphasis on culturally specific content.
🚀 Features
✅ Fine-tuned for English->Moroccan darija translation.
✅ Competitive with the strongest open-source models on TerjamaBench.
✅ Compatible with 🤗 Transformers and easily deployable on various hardware setups.  
🔥 Performance Comparison
The following table compares Terjman-Supreme-v2.0 against proprietary and open-source models using BLEU, chrF, and TER scores. Higher BLEU/chrF and lower TER indicate better translation quality.
| Model | Size | BLEU↑ | chrF↑ | TER↓ | 
|---|---|---|---|---|
| **Proprietary Models** | | | | |
| gemini-exp-1206 | * | 30.69 | 54.16 | 67.62 | 
| claude-3-5-sonnet-20241022 | * | 30.51 | 51.80 | 67.42 | 
| gpt-4o-2024-08-06 | * | 28.30 | 50.13 | 71.77 | 
| **Open-Source Models** | | | | |
| Terjman-Ultra-v2.0 | 1.3B | 25.00 | 44.70 | 77.20 | 
| Terjman-Supreme-v2.0 (This model) | 3.3B | 23.43 | 44.57 | 78.17 | 
| Terjman-Large-v2.0 | 240M | 22.67 | 42.57 | 83.00 | 
| Terjman-Nano-v2.0 | 77M | 18.84 | 38.41 | 94.73 | 
| atlasia/Terjman-Large-v1.2 | 240M | 16.33 | 37.10 | 89.13 | 
| MBZUAI-Paris/Atlas-Chat-9B | 9B | 14.80 | 35.26 | 93.95 | 
| facebook/nllb-200-3.3B | 3.3B | 14.76 | 34.17 | 94.33 | 
| atlasia/Terjman-Nano | 77M | 9.98 | 26.55 | 106.49 |
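For reference, metrics of this kind can be computed with the sacrebleu library. The snippet below is a minimal sketch with placeholder hypothesis/reference pairs; the exact evaluation settings used for TerjamaBench are not reproduced here.

```python
# pip install sacrebleu
from sacrebleu.metrics import BLEU, CHRF, TER

# Placeholder data: model outputs and one stream of reference translations
hypotheses = ["translated sentence 1", "translated sentence 2"]
references = [["reference sentence 1", "reference sentence 2"]]

bleu = BLEU().corpus_score(hypotheses, references)
chrf = CHRF().corpus_score(hypotheses, references)
ter = TER().corpus_score(hypotheses, references)

print(f"BLEU: {bleu.score:.2f}  chrF: {chrf.score:.2f}  TER: {ter.score:.2f}")
```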
🔬 Model Details
- Base Model: facebook/nllb-200-3.3B
- Architecture: Transformer-based sequence-to-sequence model (see the config-inspection sketch after this list)
- Training Data: High-quality English-Moroccan darija parallel corpora
- Training Precision: FP16 for efficient inference
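These architecture details can be checked directly from the checkpoint's configuration. The sketch below only reads metadata and assumes the fine-tuned model keeps the standard NLLB/M2M100 configuration fields:

```python
from transformers import AutoConfig

# Downloads only the config file, not the 3.3B weights
config = AutoConfig.from_pretrained("BounharAbdelaziz/Terjman-Supreme-v2.0")
print(config.model_type)  # NLLB checkpoints use the M2M100 seq2seq architecture
print(config.d_model, config.encoder_layers, config.decoder_layers)
```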
 
🚀 How to Use
You can use the model with the Hugging Face Transformers library:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "BounharAbdelaziz/Terjman-Supreme-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def translate(text, src_lang="eng_Latn", tgt_lang="ary_Arab"):
    # NLLB tokenizers take the source language as an attribute, not as a __call__ argument
    tokenizer.src_lang = src_lang
    inputs = tokenizer(text, return_tensors="pt")
    # Force the decoder to start generating in the target language
    output = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=256,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example translation
text = "Hello there! Today the weather is so nice in Geneva, couldn't ask for more to enjoy the holidays :)"
translation = translate(text)
print("Translation:", translation)
# prints: صباح الخير! اليوم الطقس زوين بزاف فجنيف، ما قدرتش نطلب أكثر باش نتمتع بالعطلة :)
```
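For translating many sentences at once, the same helper can be batched and moved to a GPU. This is a minimal sketch; the device handling, padding options, and `max_new_tokens` value are illustrative assumptions rather than part of the original recipe.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

def translate_batch(texts, src_lang="eng_Latn", tgt_lang="ary_Arab"):
    tokenizer.src_lang = src_lang
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True).to(device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
            max_new_tokens=256,
        )
    return tokenizer.batch_decode(output, skip_special_tokens=True)

print(translate_batch(["Hello there!", "See you tomorrow."]))
```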
🖥️ Deployment
Run in a Hugging Face Space
Try the model interactively in the Terjman-Ultra Space 🤗
Use with Text Generation Inference (TGI)
For fast inference, use Hugging Face TGI:
```bash
# Python client for TGI servers
pip install text-generation

# Launch a TGI server (requires a TGI installation, e.g. via the official Docker image)
text-generation-launcher --model-id BounharAbdelaziz/Terjman-Supreme-v2.0
```
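Once a server is up, it exposes TGI's standard `/generate` endpoint. The sketch below is an assumed client-side example; the host/port and generation parameters depend on how you launched the server.

```python
import requests

TGI_URL = "http://127.0.0.1:3000"  # assumed address; adjust to your launcher settings

payload = {
    "inputs": "Hello there!",
    "parameters": {"max_new_tokens": 128},
}
response = requests.post(f"{TGI_URL}/generate", json=payload, timeout=60)
print(response.json()["generated_text"])
```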
Run Locally with Transformers & PyTorch
```bash
pip install transformers torch
python -c "from transformers import pipeline; print(pipeline('translation', model='BounharAbdelaziz/Terjman-Supreme-v2.0', src_lang='eng_Latn', tgt_lang='ary_Arab')('Hello there!'))"
```
Deploy on an API Server
Use FastAPI to serve translations as an API:
```python
from fastapi import FastAPI
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

app = FastAPI()

model_name = "BounharAbdelaziz/Terjman-Supreme-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

@app.get("/translate/")
def translate(text: str):
    tokenizer.src_lang = "eng_Latn"
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("ary_Arab"),
        max_new_tokens=256,
    )
    return {"translation": tokenizer.decode(output[0], skip_special_tokens=True)}
```
🛠️ Training Hyperparameters
The model was fine-tuned using the following training settings (a 🤗 Trainer configuration sketch follows the list):
- Learning Rate: 0.0005
- Training Batch Size: 1
- Evaluation Batch Size: 1
- Seed: 42
- Gradient Accumulation Steps: 64
- Total Effective Batch Size: 64
- Optimizer: AdamW (Torch) with betas=(0.9, 0.999), epsilon=1e-08
- Learning Rate Scheduler: Linear
- Warmup Ratio: 0.1
- Epochs: 3
- Precision: Mixed FP16 for efficient training
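For reference, these settings map onto a 🤗 `Seq2SeqTrainingArguments` configuration roughly like the one below; the output directory and anything not listed above are assumptions, not the exact training script.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-supreme-v2",   # assumed path
    learning_rate=5e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=64,    # effective batch size of 64
    num_train_epochs=3,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    fp16=True,                         # mixed-precision training
)
```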
📜 License
This model is released under the CC BY-NC (Creative Commons Attribution-NonCommercial) license, meaning it can be used for research and personal projects but not for commercial purposes. For commercial use, please get in touch :)
Framework versions
- Transformers 4.47.1
- PyTorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
 
📚 Citation
```bibtex
@misc{terjman-v2,
  title        = {Terjman-v2: High-Quality English-Moroccan Darija Translation Model},
  author       = {Abdelaziz Bounhar},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/BounharAbdelaziz/Terjman-Supreme-v2.0}},
  license      = {CC BY-NC}
}
```