---
language:
- he
base_model:
- ivrit-ai/whisper-large-v3-turbo-d4-p1-take2
pipeline_tag: automatic-speech-recognition
tags:
- faster-whisper
---

# ivrit-faster-whisper-turbo-d4

This model is a conversion of the **ivrit-ai/whisper-large-v3-turbo-d4-p1-take2** model to the [**Faster-Whisper**](https://github.com/guillaumekln/faster-whisper) format, offering significantly faster inference times.

### Model Overview

- **Base Model**: [ivrit-ai/whisper-large-v3-turbo-d4-p1-take2](https://huggingface.co/ivrit-ai/whisper-large-v3-turbo-d4-p1-take2)
- **Converted to**: Faster-Whisper (for faster ASR with minimal performance loss)
- **Language**: Hebrew (`he`)
- **Quantization**: float32

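Although the weights are stored in float32, faster-whisper can quantize them on the fly at load time via the `compute_type` argument of `WhisperModel`. Below is a minimal sketch; the `pick_compute_type` helper and its device-to-precision mapping are our own illustration, not part of the faster-whisper API:

```python
def pick_compute_type(device: str) -> str:
    """Map a device to a reasonable on-the-fly quantization mode.

    This mapping is a rough heuristic of our own, not an official default:
    float16 on GPU roughly halves memory with little accuracy cost, while
    int8 is usually the fastest option on CPU.
    """
    return {"cuda": "float16", "cpu": "int8"}.get(device, "float32")


# Usage with faster-whisper (commented out here, since it downloads the model):
# import faster_whisper
# model = faster_whisper.WhisperModel(
#     "israelisraeli/ivrit-faster-whisper-turbo-d4",
#     device="cuda",
#     compute_type=pick_compute_type("cuda"),
# )
```
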
All credit goes to **ivrit-ai** for developing the original model.

## How to Use the Model

To use the model in your projects, follow the steps below to load it and transcribe an audio file:

```python
import json

import faster_whisper

# Load the converted model from Hugging Face (downloaded on first use)
model = faster_whisper.WhisperModel("israelisraeli/ivrit-faster-whisper-turbo-d4", device="cuda")

# Transcribe the audio file; `transcribe` returns a generator of segments
segs, _ = model.transcribe("AUDIOFILE_efiTheTigger.mp3", language="he")

# Collect the segments into a list of dictionaries with timestamps and text
transcribed_segments_with_timestamps = [
    {"start": s.start, "end": s.end, "text": s.text} for s in segs
]

# Save the result to a JSON file
with open("transcribed_segments_with_timestamps.json", "w", encoding="utf-8") as json_file:
    json.dump(
        transcribed_segments_with_timestamps, json_file, ensure_ascii=False, indent=4
    )

print("Transcription saved to transcribed_segments_with_timestamps.json")
```

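The same segment dictionaries can also be rendered as SRT subtitles using only the standard library. A small sketch; `format_timestamp` and `segments_to_srt` are our own helpers, not part of faster-whisper:

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT `HH:MM:SS,mmm` timestamp format."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render a list of {"start", "end", "text"} dicts as an SRT document."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

Feeding it the `transcribed_segments_with_timestamps` list from the snippet above yields a subtitle file ready for most video players.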
## Conversion Process

### Tokenizer Conversion

```python
from transformers import AutoTokenizer

# Load the tokenizer from the original Whisper model files
tokenizer_directory = "path_to_whisper_model_files"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_directory)

# Save the tokenizer as a single tokenizer.json file
tokenizer.save_pretrained("path_to_save_directory", legacy_format=False)
```

### Model Conversion to Faster-Whisper

To convert the original [ivrit-ai/whisper-large-v3-turbo-d4-p1-take2](https://huggingface.co/ivrit-ai/whisper-large-v3-turbo-d4-p1-take2) model to the Faster-Whisper format, I used the [CTranslate2](https://github.com/OpenNMT/CTranslate2) library. The following command was used for the conversion:

```bash
ct2-transformers-converter \
    --model ./whisper-large-v3-turbo-d4-p1-take2 \
    --output_dir ./ivrit-faster-whisper-turbo-d4 \
    --copy_files tokenizer.json preprocessor_config.json
```
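
After conversion, it is worth checking that the output directory contains everything faster-whisper needs. A small sketch with our own `missing_files` helper; the expected file list is an assumption based on what `ct2-transformers-converter` writes (`model.bin`, `config.json`) plus the two files copied via `--copy_files` above:

```python
from pathlib import Path

# model.bin and config.json are written by ct2-transformers-converter;
# the other two are copied in via --copy_files (see the command above).
EXPECTED_FILES = (
    "model.bin",
    "config.json",
    "tokenizer.json",
    "preprocessor_config.json",
)


def missing_files(model_dir: str, expected=EXPECTED_FILES):
    """Return the expected files that are absent from the converted model directory."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty return value means the directory can be passed straight to `faster_whisper.WhisperModel`.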