# Whisper Large v2 Uzbek Speech Recognition Model

This project contains a fine-tuned version of the Faster Whisper Large v2 model for Uzbek speech recognition. The model can be used to transcribe Uzbek audio files into text.

## Installation

1. Ensure you have Python 3.7 or higher installed.

2. Install the required libraries:

   ```
   pip install transformers datasets accelerate soundfile librosa torch
   ```
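
Before loading the model, it can help to confirm that the environment matches the two steps above. The following is a minimal sketch that uses only the standard library. The helper name `missing_packages` is made up for this example, and the sketch assumes each package is imported under the same name it is installed with.

```python
import importlib.util
import sys

# Packages from the pip command above; each is assumed to be importable
# under the same name it is installed with.
REQUIRED = ["transformers", "datasets", "accelerate", "soundfile", "librosa", "torch"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 7):
        print("Python 3.7 or higher is required, found", sys.version.split()[0])
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
        print("Install them with: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```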
   

## Usage

You can use the model with the following Python code:

```python
import torch
from transformers import pipeline, WhisperForConditionalGeneration, WhisperProcessor

# Use the GPU with half precision when available; otherwise fall back to CPU with float32
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the model and processor
model_name = "totetecdev/whisper-large-v2-uzbek-100steps"
model = WhisperForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch_dtype)
processor = WhisperProcessor.from_pretrained(model_name)

# Create the speech recognition pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    torch_dtype=torch_dtype,
    device=device,
)

# Transcribe an audio file
audio_file = "path/to/your/audio/file.wav"  # Replace with the path to your audio file
result = pipe(audio_file)

print(result["text"])
```

## Example Usage

1. Prepare your audio file (it should be in WAV format).
2. Save the above code in a Python file (e.g., `transcribe.py`).
3. Update the `model_name` and `audio_file` variables in the code with your values.
4. Run the following command in your terminal or command prompt:

   ```
   python transcribe.py
   ```

The transcribed text will be displayed on the screen.
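
Since the audio file should be in WAV format, it can be useful to inspect a file before transcribing it. The following is a minimal sketch using Python's built-in `wave` module. The helper name `describe_wav` is made up for this example, and the `wave` module only reads uncompressed PCM WAV files, so other encodings will raise an error.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file read with the standard library."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            # Duration is the frame count divided by frames per second
            "duration_s": frames / rate if rate else 0.0,
        }
```

Calling `describe_wav("path/to/your/audio/file.wav")` returns a dictionary with the channel count, sample rate, sample width, and duration. Whisper models operate on 16 kHz audio; when a file path is passed to the pipeline, resampling is expected to be handled during decoding, so this check is mainly a sanity test of the input.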

## Notes

- This model will perform best with Uzbek audio files.
- Longer audio files may require more processing time.
- GPU usage is recommended, but the model can also run on CPU.
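
For the longer recordings mentioned above, the Transformers speech recognition pipeline can split the audio into chunks at call time, which fits Whisper's 30-second input window. The following is a minimal sketch, assuming `pipe` is the pipeline created in the Usage section. The helper name `transcribe_long` and the default values (30-second chunks, batch size 8) are illustrative choices, not settings taken from this model.

```python
def transcribe_long(pipe, audio_path, chunk_length_s=30.0, batch_size=8):
    """Transcribe a long recording by letting the pipeline split it into chunks.

    `pipe` is the automatic-speech-recognition pipeline built in the Usage section.
    `chunk_length_s` and `batch_size` are passed to the pipeline call unchanged.
    """
    result = pipe(audio_path, chunk_length_s=chunk_length_s, batch_size=batch_size)
    return result["text"].strip()
```

A smaller `batch_size` lowers memory use at the cost of speed, which may matter when running on CPU.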

If you're using Google Colab, you can upload your audio file using:

```python
from google.colab import files

uploaded = files.upload()
audio_file = next(iter(uploaded))
```
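
`files.upload()` returns a dictionary keyed by file name, so several files can be uploaded at once. The following is a minimal sketch for transcribing all of them, assuming `pipe` is the pipeline from the Usage section. The helper name `transcribe_uploaded` is hypothetical.

```python
def transcribe_uploaded(pipe, uploaded):
    """Transcribe every uploaded file and return a {file name: text} dictionary.

    `uploaded` is the dictionary returned by `files.upload()` in Google Colab;
    only its keys (the file names) are used here.
    """
    return {name: pipe(name)["text"] for name in uploaded}


# In a Colab cell, after creating `pipe`:
# from google.colab import files
# texts = transcribe_uploaded(pipe, files.upload())
# for name, text in texts.items():
#     print(name, "->", text)
```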

## Model Details

- Base Model: Faster Whisper Large v2
- Fine-tuned for: Uzbek Speech Recognition

## License

This project is licensed under [LICENSE]. See the LICENSE file for details.

## Contact

For questions or feedback, please contact Khabib Salimov at totetec.dev@gmail.com.

## Acknowledgements

- OpenAI for the original Whisper model