metadata

license: mit
language:
  - en
datasets:
  - openslr/librispeech_asr
base_model:
  - facebook/hubert-base-ls960

Improving Spoken Language Modeling with Phoneme Classification: A Simple Fine-tuning Approach

Paper: https://arxiv.org/abs/2410.00025 (presented at EMNLP 2024)

This branch contains the HuBERT model fine-tuned with a phoneme classification objective on the LibriSpeech train-clean-100 subset. See the companion repository: https://github.com/bootphon/spokenlm-phoneme.

Load the model like this (the phonslm package must be installed first):

from phonslm import HuBERTPhoneme

model = HuBERTPhoneme.from_pretrained("coml/hubert-phoneme-classification")
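The output format of the model is not described here, so the following is only a sketch of how frame-level phoneme predictions are commonly post-processed, not the phonslm API. It assumes the classifier yields one score vector per frame and that frames are 20 ms apart, as in HuBERT base. The phoneme inventory `PHONEMES`, the helper `decode_frames`, and the toy scores are all hypothetical and exist only for illustration.

```python
from itertools import groupby

# Hypothetical phoneme inventory, for illustration only; the real label set
# is defined by the companion repository.
PHONEMES = ["SIL", "HH", "AH", "L", "OW"]

# Assumption: one frame per 20 ms, the frame rate of HuBERT base on 16 kHz audio.
FRAME_MS = 20


def decode_frames(logits):
    """Greedy per-frame decoding: argmax each frame, then merge repeats.

    `logits` is a list of per-frame score lists (frames x classes).
    Returns a list of (phoneme, start_ms, end_ms) segments.
    """
    # Pick the highest-scoring class index for every frame.
    frame_ids = [max(range(len(row)), key=row.__getitem__) for row in logits]

    segments = []
    start = 0
    # Consecutive frames with the same label form one phoneme segment.
    for class_id, run in groupby(frame_ids):
        length = sum(1 for _ in run)
        end = start + length
        segments.append((PHONEMES[class_id], start * FRAME_MS, end * FRAME_MS))
        start = end
    return segments


# Toy scores for 5 frames over the 5 hypothetical classes.
toy_logits = [
    [2.0, 0.1, 0.0, 0.0, 0.0],  # SIL
    [0.0, 3.0, 0.2, 0.0, 0.0],  # HH
    [0.1, 2.5, 0.3, 0.0, 0.0],  # HH
    [0.0, 0.2, 1.9, 0.1, 0.0],  # AH
    [0.0, 0.0, 0.1, 0.2, 2.2],  # OW
]
segments = decode_frames(toy_logits)
print([name for name, _, _ in segments])
```

With real model outputs, the nested lists would be replaced by the classifier's per-frame scores converted to plain Python lists, and `PHONEMES` by the label set used during fine-tuning.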