metadata

thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
language: ja
license: apache-2.0
datasets: reazon-research/reazonspeech
inference: false
tags:
  - hubert
  - speech

`rinna/japanese-hubert-large`

Overview

This is a Japanese HuBERT Large model trained by rinna Co., Ltd.

Model summary

The model architecture is the same as the original HuBERT Large model, which contains 24 transformer layers with 16 attention heads. The model was trained using code from the official repository, and the detailed training configuration can be found in the same repository and the original paper.

Training

The model was trained on approximately 19,000 hours of speech from the following Japanese corpus:
- ReazonSpeech v1

Contributors

How to use the model

import soundfile as sf
import torch
from transformers import AutoFeatureExtractor, AutoModel

model_name = "rinna/japanese-hubert-large"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

# Placeholder path: replace with your own audio file
# (16 kHz, as the variable name below assumes).
audio_file = "path/to/audio.wav"

raw_speech_16kHz, sr = sf.read(audio_file)
inputs = feature_extractor(
    raw_speech_16kHz,
    return_tensors="pt",
    sampling_rate=sr,
)
# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model(**inputs)

print(f"Input:  {inputs.input_values.size()}")  # [1, #samples]
print(f"Output: {outputs.last_hidden_state.size()}")  # [1, #frames, 1024]
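The relation between `#samples` and `#frames` in the shapes above can be estimated without loading the model. The following is a minimal sketch, assuming this checkpoint uses the standard HuBERT convolutional feature encoder (seven layers with the kernel sizes and strides listed below); these values are an assumption and should be checked against the checkpoint's `config.json`. The helper name `num_output_frames` is hypothetical.

```python
# Kernel sizes and strides of the convolutional feature encoder.
# ASSUMPTION: the standard HuBERT / wav2vec 2.0 configuration; verify
# against the conv_kernel and conv_stride entries of the model config.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_output_frames(num_samples: int) -> int:
    """Estimate how many frames the encoder emits for a waveform length.

    Each unpadded 1-D convolution maps a length L to
    floor((L - kernel) / stride) + 1.
    """
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            # The input is too short to produce any frame.
            return 0
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    sampling_rate = 16000
    for seconds in (1, 5, 10):
        samples = sampling_rate * seconds
        print(f"{seconds:>2} s -> {samples} samples -> "
              f"{num_output_frames(samples)} frames")
```

Under these assumptions the total stride is 320 samples, so one second of 16 kHz audio yields 49 frames, or roughly one frame every 20 ms.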

A fairseq checkpoint file is also available here.

How to cite

@misc{rinna-japanese-hubert-large, 
  title={rinna/japanese-hubert-large}, 
  author={Hono, Yukiya and Mitsui, Kentaro and Sawada, Kei},
  url={https://huggingface.co/rinna/japanese-hubert-large}
}

Citations

@article{hsu2021hubert,
  author={Hsu, Wei-Ning and Bolte, Benjamin and Tsai, Yao-Hung Hubert and Lakhotia, Kushal and Salakhutdinov, Ruslan and Mohamed, Abdelrahman},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  title={HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units},
  year={2021},
  volume={29},
  number={},
  pages={3451-3460},
  doi={10.1109/TASLP.2021.3122291}
}

License

The Apache 2.0 license