reazon-research/japanese-wav2vec2-large
This is a Japanese wav2vec 2.0 Large model pre-trained on the ReazonSpeech v2.0 corpus.
We also release the CTC model reazon-research/japanese-wav2vec2-large-rs35kh, which is derived from this model.
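import librosa
import torch
from transformers import AutoFeatureExtractor, AutoModel

# Load the feature extractor and the pre-trained model from the Hugging Face Hub.
feature_extractor = AutoFeatureExtractor.from_pretrained("reazon-research/japanese-wav2vec2-large")
model = AutoModel.from_pretrained("reazon-research/japanese-wav2vec2-large")

# Placeholder path (not part of the original card); point this at your own audio file.
audio_file = "sample.wav"

# librosa resamples the audio to 16 kHz on load.
audio, sr = librosa.load(audio_file, sr=16_000)
inputs = feature_extractor(
    audio,
    return_tensors="pt",
    sampling_rate=sr,
)

# Run the forward pass without tracking gradients.
with torch.inference_mode():
    outputs = model(**inputs)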
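
The model returns one feature vector per frame in `outputs.last_hidden_state`, shaped `(batch, frames, hidden_size)`. As a rough guide to how many frames to expect for a given clip, the sketch below applies the unpadded-convolution length formula layer by layer. It assumes this checkpoint keeps the standard wav2vec 2.0 feature-encoder kernels and strides. That is an assumption, not something the card states, so check the `conv_kernel` and `conv_stride` entries in the model's `config.json` before relying on it. The helper name `num_output_frames` is illustrative only.

```python
# Assumed feature-encoder layout: the standard wav2vec 2.0 defaults.
# Confirm against conv_kernel / conv_stride in the model's config.json.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_output_frames(num_samples: int) -> int:
    """Estimate the frame count (dim 1 of last_hidden_state) for a raw waveform."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Unpadded 1-D convolution: floor((L - kernel) / stride) + 1
        length = (length - kernel) // stride + 1
    return length


# One second of 16 kHz audio under the assumed layout.
print(num_output_frames(16_000))  # 49
```

Under these assumptions each frame covers a hop of about 20 ms, so a one-second clip yields 49 frames. If the printed value differs from `outputs.last_hidden_state.shape[1]`, the checkpoint uses a different encoder layout.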
@misc{reazon-research-japanese-wav2vec2-large,
  title = {japanese-wav2vec2-large},
  author = {Sasaki, Yuta},
  url = {https://huggingface.co/reazon-research/japanese-wav2vec2-large},
  year = {2024}
}