Model Overview

This model is a fine-tuned version of Microsoft's SpeechT5 text-to-speech model, adapted to handle technical terminology, abbreviations, and domain-specific jargon. It was trained on a custom dataset of 100 text entries focused on terms used in technical interviews and professional communication. The fine-tuning is intended to improve the pronunciation of technical terms, and with it the quality of TTS output in scenarios that require domain expertise.

Model Details

Base Model: microsoft/speecht5_tts
Language: English (en)
License: MIT
Dataset: Custom English texts, primarily focused on technical terminology commonly encountered in fields such as computer science, engineering, and software development.

Dataset

Text Data: Contains 100 text entries, each including technical terms, abbreviations, and industry-specific vocabulary. The text length varies from short sentences to longer technical descriptions.
Audio Data: Corresponding synthesized audio generated for each text entry. Audio is encoded in WAV format, sampled at 16 kHz, and designed for TTS applications.
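As a rough illustration of the audio format described above, the sketch below reads a WAV file with Python's standard `wave` module and checks that it is sampled at 16 kHz. The helper names `describe_wav` and `matches_dataset_format` are hypothetical and are not part of this model or its dataset tooling.

```python
import wave

EXPECTED_RATE = 16000  # 16 kHz, the sampling rate stated for the dataset audio


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return rate, wf.getnchannels(), wf.getnframes() / rate


def matches_dataset_format(path):
    """True if the file uses the 16 kHz sampling rate described for the dataset."""
    rate, _channels, _duration = describe_wav(path)
    return rate == EXPECTED_RATE
```

A check like this can be useful before mixing additional recordings into a fine-tuning set, since audio at a different sampling rate would need to be resampled first.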

Usage

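from transformers import AutoTokenizer, SpeechT5ForTextToSpeech

# transformers has no AutoModelForSpeechT5 class; the text-to-speech model
# is loaded with SpeechT5ForTextToSpeech.
tokenizer = AutoTokenizer.from_pretrained("Arch10/SpeechT5_finetune_technical_terms")
model = SpeechT5ForTextToSpeech.from_pretrained("Arch10/SpeechT5_finetune_technical_terms")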
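Loading the tokenizer and model is only the first step; producing audio also needs a speaker embedding and a vocoder. The following is a minimal sketch rather than the author's documented pipeline. It assumes the `microsoft/speecht5_hifigan` vocoder (this card does not name one) and a caller-supplied x-vector speaker embedding, a `torch` tensor of shape `(1, 512)` (this card does not say which speaker was used). The function names `synthesize` and `save_wav` are hypothetical.

```python
import wave
from array import array

MODEL_ID = "Arch10/SpeechT5_finetune_technical_terms"
VOCODER_ID = "microsoft/speecht5_hifigan"  # assumption: not specified by this card
SAMPLE_RATE = 16000  # matches the 16 kHz audio described in the Dataset section


def synthesize(text, speaker_embedding):
    """Return synthesized speech for `text` as a list of floats in [-1, 1].

    `speaker_embedding` is assumed to be a torch tensor of shape (1, 512).
    """
    # Imported lazily so save_wav() can be used without transformers installed.
    from transformers import AutoTokenizer, SpeechT5ForTextToSpeech, SpeechT5HifiGan

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = SpeechT5ForTextToSpeech.from_pretrained(MODEL_ID)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_ID)

    inputs = tokenizer(text, return_tensors="pt")
    # With a vocoder supplied, generate_speech returns a 1-D waveform tensor.
    speech = model.generate_speech(inputs["input_ids"], speaker_embedding, vocoder=vocoder)
    return speech.tolist()


def save_wav(samples, path, sample_rate=SAMPLE_RATE):
    """Write float samples in [-1, 1] as 16-bit mono PCM WAV, clipping out-of-range values."""
    pcm = array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes = 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(pcm.tobytes())
```

With a suitable embedding in hand, usage would look like `save_wav(synthesize("The API returns JSON over HTTPS.", speaker_embedding), "out.wav")`. Loading the model inside `synthesize` keeps the sketch short; for repeated calls it would make sense to load the tokenizer, model, and vocoder once and reuse them.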
