
Moroccan Darija Text-to-Speech Model

This model is a fine-tuned version of SpeechT5 for Moroccan Darija Text-to-Speech synthesis.

Model Details

  • Base Model: Microsoft SpeechT5
  • Fine-tuned on: DODa audio dataset
  • Languages: Moroccan Darija (Latin script)
  • Features: Multiple voice support (male/female)
  • Release Date: April 2025

Usage

from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan
import torch
import soundfile as sf

# Load models
processor = SpeechT5Processor.from_pretrained("HAMMALE/speecht5-darija")
model = SpeechT5ForTextToSpeech.from_pretrained("HAMMALE/speecht5-darija")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

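# Load speaker embedding. The random vector below is only a placeholder and
# may not produce a natural voice; replace it with a real speaker embedding
# of shape (1, 512).
speaker_embedding = torch.randn(1, 512)  # placeholder embedding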

# Process text
text = "Salam, kifach nta lyoum?"
inputs = processor(text=text, return_tensors="pt")

# Generate speech
speech = model.generate_speech(inputs["input_ids"], speaker_embedding, vocoder=vocoder)

# Save audio file
sf.write("output.wav", speech.numpy(), 16000)
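If soundfile is not installed, the save step can be approximated with Python's standard library. The following is a minimal sketch, not part of the original model card: the helper name write_wav_16k is hypothetical, and it assumes the generated speech is a mono sequence of floats in the range [-1.0, 1.0] sampled at 16 kHz, as in the example above.

```python
import struct
import wave


def write_wav_16k(path, samples, sample_rate=16000):
    """Write mono float samples as a 16-bit PCM WAV file.

    Hypothetical helper for illustration only. Samples are clipped to
    [-1.0, 1.0] and scaled to signed 16-bit integers.
    Returns the number of frames written.
    """
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(2)          # 2 bytes = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))

    return len(frames) // 2
```

With the variables from the usage example, this could be called as write_wav_16k("output.wav", speech.tolist()).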

Demo

A live demo is available on Hugging Face Spaces.

License

This model is available under the MIT License.

Acknowledgments
