saattrupdan/wav2vec2-xls-r-300m-ftspeech


---
language:
- da
license: other
tasks:
- automatic-speech-recognition
datasets:
- ftspeech
metrics:
- wer
model-index:
- name: wav2vec2-xls-r-300m-ftspeech
  results:
  - task:
      type: automatic-speech-recognition
    dataset:
      type: mozilla-foundation/common_voice_8_0
      args: da
      name: Danish Common Voice 8.0
    metrics:
    - type: wer
      value: 17.91
  - task:
      type: automatic-speech-recognition
    dataset:
      type: Alvenir/alvenir_asr_da_eval
      name: Alvenir ASR test dataset
    metrics:
    - type: wer
      value: 13.84
---

# XLS-R-300m-FTSpeech

## Model description

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the [FTSpeech dataset](https://ftspeech.github.io/), which consists of 1,800 hours of transcribed speeches from the Danish parliament.
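
A minimal usage sketch, assuming the `transformers` library (with a PyTorch backend) is installed and that the checkpoint is loaded through the generic automatic-speech-recognition pipeline. The helper name `transcribe` and the file name `speech_da.wav` are hypothetical, and decoding audio from a file path may additionally require `ffmpeg`. This card does not say whether the 5-gram language model ships with the checkpoint, so the sketch makes no assumption about LM-boosted decoding.

```python
MODEL_ID = "saattrupdan/wav2vec2-xls-r-300m-ftspeech"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Given a file path, the pipeline decodes the audio and resamples it
    # to the sampling rate the model expects before running inference.
    return asr(audio_path)["text"]


# Example call (downloads the model on first use):
# text = transcribe("speech_da.wav")  # hypothetical local file
```

Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to construct it once and reuse it.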


## Performance

The model achieves the following WER scores (lower is better):

| Dataset | WER without LM | WER with 5-gram LM |
| :---: | ---: | ---: |
| [Danish part of Common Voice 8.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_8_0/viewer/da/train) | 20.48 | 17.91 |
| [Alvenir test set](https://huggingface.co/datasets/Alvenir/alvenir_asr_da_eval) | 15.46 | 13.84 |
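
For readers unfamiliar with the metric, the sketch below shows the standard definition of word error rate: the word-level edit distance between reference and hypothesis, divided by the number of reference words, expressed in percent. This card does not state which tool or text normalization produced the scores above, so the function `word_error_rate` is an illustrative assumption and not the evaluation code behind the table.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance over reference length, in percent."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word is missing
            insertion = current[j - 1] + 1  # extra hypothesis word
            current.append(min(substitution, deletion, insertion))
        previous = current
    return 100.0 * previous[-1] / len(ref)


# One of five reference words is dropped from the hypothesis.
print(word_error_rate("det er en god dag", "det er en dag"))  # 20.0
```

Because the denominator is the reference length, WER can exceed 100 when the hypothesis contains many inserted words.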


## License

Use of this model must adhere to [this license from the Danish Parliament](https://www.ft.dk/da/aktuelt/tv-fra-folketinget/deling-og-rettigheder).