	---
	license: apache-2.0
	language:
	- de
	library_name: transformers
	pipeline_tag: automatic-speech-recognition
	---

	# whisper-tiny-german

	This model is a German Speech Recognition model based on the [whisper-tiny](https://huggingface.co/openai/whisper-tiny) model.
	The model has 37.8M parameters and a size of 73 MB in bfloat16 format.
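
As a rough sanity check on the reported size: bfloat16 stores each parameter in 2 bytes, so the weights alone should take about 37.8M × 2 bytes. The sketch below only does that arithmetic. The small gap to the 73 MB figure presumably comes from rounding and file metadata, which is an assumption and not something this card states.

```python
# Back-of-the-envelope weight size for a bfloat16 checkpoint.
NUM_PARAMS = 37_800_000   # parameter count reported in this card
BYTES_PER_PARAM = 2       # bfloat16 uses 16 bits (2 bytes) per value


def weight_size_bytes(num_params: int, bytes_per_param: int) -> int:
    """Return the raw size of the weights in bytes."""
    return num_params * bytes_per_param


size = weight_size_bytes(NUM_PARAMS, BYTES_PER_PARAM)
size_mb = size / 1e6       # decimal megabytes
size_mib = size / 2**20    # binary mebibytes
print(f"{size_mb:.1f} MB ({size_mib:.1f} MiB)")
```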

	As a follow-up to [Whisper large v3 german](https://huggingface.co/primeline/whisper-large-v3-german), we created a tiny version for edge cases where model size is a concern.

	## Intended uses & limitations

	The model is intended for German speech recognition tasks, particularly in edge cases where model size is a concern.
	It is not recommended for critical use cases: as a tiny model, it may not perform well in all scenarios.

	## Dataset

	The training data is a filtered subset of the [Common Voice](https://huggingface.co/datasets/common_voice) dataset, Multilingual LibriSpeech, and some internal data.
	The data was filtered and double-checked for quality and correctness.
	We normalized the text data, especially casing and punctuation.
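
This card does not spell out the normalization rules. As an illustration only, the sketch below shows one common style of transcript normalization used when comparing ASR output with reference text: lowercasing and stripping punctuation while leaving German letters untouched. The function name `normalize_transcript` and the exact character set are hypothetical and are not the authors' pipeline.

```python
import re
import string

# Hypothetical normalizer: NOT the rules used to build the training data.
# Removes ASCII punctuation plus common German/typographic quotation marks.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation + "„“”‚‘’«»")


def normalize_transcript(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(_PUNCT_TABLE)
    # Collapse runs of whitespace left behind by removed punctuation.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_transcript("Guten Morgen, wie geht es Ihnen?"))
```

Applying the same function to both the model output and the reference text keeps casing and punctuation differences from counting as recognition errors.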


	## Model family

	| Model                          | Parameters | Link                                                                 |
	|--------------------------------|------------|----------------------------------------------------------------------|
	| Whisper large v3 german        | 1.54B      | [link](https://huggingface.co/primeline/whisper-large-v3-german)        |
	| Distil-whisper large v3 german | 756M       | [link](https://huggingface.co/primeline/distil-whisper-large-v3-german) |
	| tiny whisper                   | 37.8M      | [link](https://huggingface.co/primeline/whisper-tiny-german)            |

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- total_train_batch_size: 512
	- num_epochs: 5.0
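
In model cards generated by the Transformers `Trainer`, `total_train_batch_size` is normally the effective batch size per optimizer step: per-device batch size × number of devices × gradient accumulation steps. This card does not report those individual factors, so the values below are hypothetical and only show how a total of 512 could be reached.

```python
def total_train_batch_size(per_device: int, num_devices: int, grad_accum: int) -> int:
    """Effective number of samples contributing to one optimizer step."""
    return per_device * num_devices * grad_accum


# Hypothetical splits (not taken from this card) that all yield 512.
candidate_setups = [
    (64, 8, 1),   # 8 devices, no gradient accumulation
    (32, 8, 2),   # 8 devices, 2 accumulation steps
    (16, 1, 32),  # single device, heavy accumulation
]

for per_device, num_devices, grad_accum in candidate_setups:
    total = total_train_batch_size(per_device, num_devices, grad_accum)
    print(per_device, num_devices, grad_accum, "->", total)
```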

	### Framework versions

	- Transformers 4.39.3
	- Pytorch 2.3.0a0+ebedce2
	- Datasets 2.18.0
	- Tokenizers 0.15.2
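
To compare a local environment with the versions listed above, a small check like the following may help. It is a sketch that uses only the standard library. The PyPI distribution names (`torch` for Pytorch, and so on) are assumed, and a version mismatch does not necessarily mean the model will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.39.3",
    "torch": "2.3.0a0+ebedce2",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(CARD_VERSIONS).items():
    print(f"{name}: card={CARD_VERSIONS[name]} installed={installed}")
```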


	### How to use

	```python
	import torch
	from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
	from datasets import load_dataset

	# Use the GPU with half precision when available, otherwise CPU with float32.
	device = "cuda:0" if torch.cuda.is_available() else "cpu"
	torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

	model_id = "primeline/whisper-tiny-german"

	model = AutoModelForSpeechSeq2Seq.from_pretrained(
	    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
	)
	model.to(device)

	processor = AutoProcessor.from_pretrained(model_id)

	# Long-form transcription: audio is split into 30-second chunks and batched.
	pipe = pipeline(
	    "automatic-speech-recognition",
	    model=model,
	    tokenizer=processor.tokenizer,
	    feature_extractor=processor.feature_extractor,
	    max_new_tokens=128,
	    chunk_length_s=30,
	    batch_size=16,
	    return_timestamps=True,
	    torch_dtype=torch_dtype,
	    device=device,
	)

	# Sample audio from the Hub. LibriSpeech is English, so substitute your own
	# German recording for meaningful transcripts.
	dataset = load_dataset("distil-whisper/librispeech_long", "clean", split="validation")
	sample = dataset[0]["audio"]

	result = pipe(sample)
	print(result["text"])
	```
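
Because the pipeline above is created with `return_timestamps=True`, the result should also carry a `"chunks"` list whose entries have a `"timestamp"` pair and a `"text"` string. The helper below is a minimal sketch for printing those segments. `format_chunks` is a hypothetical name, and the example data is made up so the snippet runs without downloading the model.

```python
def _fmt(seconds):
    """Format a timestamp in seconds; the pipeline may emit None at a boundary."""
    return f"{seconds:.2f}" if seconds is not None else "?"


def format_chunks(chunks):
    """Turn pipeline-style chunks into '[start - end] text' lines."""
    lines = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        lines.append(f"[{_fmt(start)} - {_fmt(end)}] {chunk['text'].strip()}")
    return lines


# Made-up data in the shape described above (not real model output).
example_chunks = [
    {"timestamp": (0.0, 2.5), "text": " Guten Morgen."},
    {"timestamp": (2.5, None), "text": " Wie geht es Ihnen?"},
]

for line in format_chunks(example_chunks):
    print(line)
```

With a real result, the call would be `format_chunks(result["chunks"])`.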


	## [About us](https://primeline-ai.com/en/)

	[![primeline AI](https://primeline-ai.com/wp-content/uploads/2024/02/pl_ai_bildwortmarke_original.svg)](https://primeline-ai.com/en/)


	Your partner for AI infrastructure in Germany <br>
	Experience the powerful AI infrastructure that drives your ambitions in Deep Learning, Machine Learning & High-Performance Computing. Optimized for AI training and inference.



	Model author: [Florian Zimmermeister](https://huggingface.co/flozi00)