
Model Card for emlinking/wav2vec2-large-xls-r-300m-tsm-asr-v6

An automatic speech recognition model for Taiwanese Southern Min which generates transcriptions in the Tâi-lô orthography.

Model Details

Model Description

An automatic speech recognition model for Taiwanese Southern Min which generates transcriptions in the Tâi-lô orthography.

  • Developed by: Eleanor Lin
  • Language(s) (NLP): Taiwanese Southern Min (Taiwanese)
  • Finetuned from model: facebook/wav2vec2-xls-r-300m

Model Sources

  • Paper: Babu, A., Wang, C., Tjandra, A., Lakhotia, K., Xu, Q., Goyal, N., ... & Auli, M. (2021). XLS-R: Self-supervised cross-lingual speech representation learning at scale. arXiv preprint arXiv:2111.09296.

Uses

This model can be used to transcribe Taiwanese speech into the Tâi-lô orthography, for example to automatically generate transcripts of videos or podcasts.
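A minimal inference sketch, assuming the Hugging Face `transformers` library is installed with a PyTorch backend and that ffmpeg is available for audio decoding. The `transcribe` helper and the file name in the usage comment are hypothetical, and this is not necessarily how the author ran the model.

```python
MODEL_ID = "emlinking/wav2vec2-large-xls-r-300m-tsm-asr-v6"


def transcribe(audio_path: str) -> str:
    """Return the Tâi-lô transcription of one audio file (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    # The first call downloads the checkpoint from the Hugging Face Hub.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Given a file path, the pipeline decodes the audio and resamples it
    # to the sampling rate expected by the model's feature extractor.
    return asr(audio_path)["text"]


# Usage (placeholder path, needs network access on first run):
# print(transcribe("example.wav"))
```

For many files it would be cheaper to build the pipeline once and reuse it, rather than rebuilding it on every call as this sketch does.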

Training Details

Training Data

This model is fine-tuned on 9.57 hours of Taiwanese speech (10,949 spoken utterances) from the following sources:

Training Procedure

Preprocessing

All punctuation except hyphens ("-") is removed from the transcriptions, and the audio is resampled to 16 kHz.
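The text side of this step could look roughly like the sketch below. It assumes "punctuation" means any character in a Unicode punctuation category, and it also collapses the whitespace left behind. The actual preprocessing script is not shown in this card, so the function name and both choices are illustrative only.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Remove punctuation except the hyphen, then collapse whitespace.

    Assumption: punctuation = Unicode general categories starting with "P".
    Combining tone marks (category "Mn") and letters are left untouched.
    """
    kept = [
        ch for ch in text
        if ch == "-" or not unicodedata.category(ch).startswith("P")
    ]
    # split()/join normalizes runs of spaces created by the removal.
    return " ".join("".join(kept).split())


print(strip_punctuation("Guá sī Tâi-uân-lâng, lí hó!"))
# Guá sī Tâi-uân-lâng lí hó
```

The hyphen is kept because Tâi-lô joins the syllables of a word with hyphens. Audio resampling is a separate step that this sketch does not cover.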

Training Hyperparameters

  • Training regime: per-device training batch size=8, gradient accumulation steps=2, fp16 (16-bit mixed-precision) training, group_by_length=True, learning rate=3e-4, warmup steps=500, epochs=30
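As a sketch, the values above can be written as a plain dictionary whose keys follow the parameter names of `transformers.TrainingArguments`. The card does not confirm that this class was used, and the single-GPU count is an assumption based on the one Tesla T4 listed under Environmental Impact.

```python
# Values copied from the list above; key names follow
# transformers.TrainingArguments (an assumption about the training setup).
training_config = {
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "fp16": True,
    "group_by_length": True,
    "learning_rate": 3e-4,
    "warmup_steps": 500,
    "num_train_epochs": 30,
}

NUM_GPUS = 1  # assumption: a single Colab GPU

# Number of utterances contributing to each optimizer update.
effective_batch_size = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
    * NUM_GPUS
)
print(effective_batch_size)  # 16
```

Gradient accumulation gives an effective batch of 16 utterances per update while only 8 need to fit in GPU memory at a time.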

Testing Data, Factors & Metrics

Testing Data

TAT Speech-to-Speech Translation Benchmark validation set

Metrics

Word error rate
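For reference, word error rate is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below assumes words are separated by whitespace, so a hyphenated Tâi-lô word counts as one token. The evaluation tooling actually used is not stated in this card, and `word_error_rate` is an illustrative name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of three reference words.
print(word_error_rate("guá sī tâi-uân-lâng", "guá si tâi-uân-lâng"))
```

A corpus-level score is usually obtained by summing errors and reference words over all utterances before dividing, rather than averaging per-utterance rates.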

Results

Validation set WER = 0.666 (66.6%)

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: Tesla T4 GPU
  • Hours used: 10.4
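A rough sketch of the kind of estimate such a calculator produces: energy is power draw times runtime, and emissions are energy times the carbon intensity of the grid. It assumes the GPU ran at its full 70 W rated board power for the whole 10.4 hours and ignores datacenter overhead. The carbon intensity is left as a parameter because the compute region is not stated here, and the function names are illustrative.

```python
T4_TDP_WATTS = 70.0  # rated board power of the Tesla T4
HOURS_USED = 10.4    # from the list above


def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy consumed at constant power, in kilowatt-hours."""
    return power_watts * hours / 1000.0


def emissions_kg_co2eq(power_watts: float, hours: float,
                       kg_co2eq_per_kwh: float) -> float:
    """Emissions for a given grid carbon intensity (supplied by the caller)."""
    return energy_kwh(power_watts, hours) * kg_co2eq_per_kwh


energy = energy_kwh(T4_TDP_WATTS, HOURS_USED)
print(f"{energy:.3f} kWh")  # 0.728 kWh
```

Under these assumptions the run used at most about 0.73 kWh of GPU energy. Converting that to CO2-equivalent requires the carbon intensity of the region where the Colab instance ran.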

Software

This model was fine-tuned using free Google Colab GPU time.

Citation

Eleanor Lin. Developing Performant Models for Translating Spoken Taiwanese Into Spoken English Using Free and Publicly Available Resources. Columbia University Program of Linguistics, April 2024. Undergraduate thesis.

BibTeX:

Forthcoming

APA:

Forthcoming

Model Card Authors

Eleanor Lin

Model Card Contact

e.lin2@columbia.edu
