	```markdown
	# Whisper Large v2 Uzbek Speech Recognition Model

	This project contains a fine-tuned version of the Faster Whisper Large v2 model for Uzbek speech recognition. The model can be used to transcribe Uzbek audio files into text.

	## Installation

	1. Ensure you have Python 3.7 or higher installed.

	2. Install the required libraries: `pip install transformers datasets accelerate soundfile librosa torch`

	## Usage

	You can use the model with the following Python code:

	```python
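	from transformers import pipeline, WhisperForConditionalGeneration, WhisperProcessor
	import torch

	# Use the GPU with half precision when available; otherwise fall back to
	# CPU with float32, since float16 inference is poorly supported on CPU
	device = "cuda:0" if torch.cuda.is_available() else "cpu"
	torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

	# Load the model and processor; dtype and device are set here because the
	# model is handed to pipeline() already instantiated
	model_name = "totetecdev/whisper-large-v2-uzbek-100steps"
	model = WhisperForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch_dtype)
	model.to(device)
	processor = WhisperProcessor.from_pretrained(model_name)

	# Create the speech recognition pipeline
	pipe = pipeline(
	    "automatic-speech-recognition",
	    model=model,
	    tokenizer=processor.tokenizer,
	    feature_extractor=processor.feature_extractor,
	    torch_dtype=torch_dtype,
	    device=device,
	)

	# Transcribe an audio file
	audio_file = "path/to/your/audio/file.wav"  # Replace with the path to your audio file
	result = pipe(audio_file)

	print(result["text"])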
	```
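
To transcribe several files in one run, the `pipe` object created above can be called in a loop. The following is a minimal sketch. The helper name `transcribe_folder` and the `*.wav` pattern are illustrative assumptions, not part of this repository.

```python
from pathlib import Path


def transcribe_folder(pipe, folder, pattern="*.wav"):
    """Run `pipe` on every matching file in `folder`; return {file name: text}.

    `transcribe_folder` is a hypothetical helper for illustration only.
    `pipe` is any callable that maps a file path to a dict with a "text" key,
    such as the speech recognition pipeline created above.
    """
    transcripts = {}
    # Sort the paths so the output order is stable between runs
    for path in sorted(Path(folder).glob(pattern)):
        result = pipe(str(path))
        transcripts[path.name] = result["text"].strip()
    return transcripts
```

For example, `transcribe_folder(pipe, "recordings")` would return one transcript per WAV file in a local `recordings` directory (a placeholder name).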

	## Example Usage

	1. Prepare your audio file (it should be in WAV format).
	2. Save the above code in a Python file (e.g., `transcribe.py`).
	3. Set the `audio_file` variable to the path of your audio file. Leave `model_name` unchanged unless you want to load a different checkpoint.
	4. Run the following command in your terminal or command prompt:

	```
	python transcribe.py
	```

	5. The transcribed text will be displayed on the screen.
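
Before running the script, it can help to confirm that the file from step 1 is a readable WAV file. The sketch below uses Python's standard `wave` module, which reads PCM WAV files and raises `wave.Error` on anything it cannot parse. The helper name `describe_wav` is an assumption made for illustration.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            # Duration in seconds = number of frames / frames per second
            "duration_s": frames / rate,
        }
```

Whisper expects 16 kHz audio. When given a file path, the Transformers pipeline decodes and resamples the file itself (this relies on `ffmpeg` being available), so a different sample rate reported here is informational rather than an error.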

	## Notes

	- This model will perform best with Uzbek audio files.
	- Longer audio files may require more processing time.
	- GPU usage is recommended, but the model can also run on CPU.
	- If you're using Google Colab, you can upload your audio file using:

	```python
	from google.colab import files
	uploaded = files.upload()
	audio_file = next(iter(uploaded))
	```
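
On the note about longer audio: Whisper processes audio in 30-second windows, and the Transformers speech recognition pipeline can split a longer file into such windows through its `chunk_length_s` argument. The following is a minimal sketch, assuming the `pipe` object from the Usage section. The helper name `transcribe_long` and the 30-second default are illustrative choices, not settings documented for this model.

```python
def transcribe_long(pipe, audio_file, chunk_length_s=30):
    """Transcribe a long recording in fixed-size chunks (hypothetical helper).

    `pipe` is expected to be a Transformers automatic-speech-recognition
    pipeline; `chunk_length_s` and `return_timestamps` are call-time
    arguments of that pipeline.
    """
    result = pipe(
        audio_file,
        chunk_length_s=chunk_length_s,
        return_timestamps=True,  # adds a "chunks" list with (start, end) times
    )
    # Pair each segment's (start, end) timestamp with its cleaned-up text
    segments = [
        (chunk["timestamp"], chunk["text"].strip())
        for chunk in result.get("chunks", [])
    ]
    return result["text"].strip(), segments
```

The function returns the full transcript together with a list of timestamped segments, which makes it easier to locate a passage in a long recording.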

	## Model Details

	- Base Model: Faster Whisper Large v2
	- Fine-tuned for: Uzbek Speech Recognition

	## License

	This project is licensed under [LICENSE]. See the LICENSE file for details.

	## Contact

	For questions or feedback, please contact KHABIB SALIMOV at totetec.dev@gmail.com.

	## Acknowledgements

	- OpenAI for the original Whisper model

	```