---
language:
- he
base_model:
- ivrit-ai/whisper-large-v3-turbo-d4-p1-take2
pipeline_tag: automatic-speech-recognition
tags:
- faster-whisper
---

# ivrit-faster-whisper-turbo-d4

This model is a conversion of the **ivrit-ai/whisper-large-v3-turbo-d4-p1-take2** model to the [**Faster-Whisper**](https://github.com/guillaumekln/faster-whisper) format, offering significantly faster inference times.

### Model Overview

- **Base Model**: [ivrit-ai/whisper-large-v3-turbo-d4-p1-take2](https://huggingface.co/ivrit-ai/whisper-large-v3-turbo-d4-p1-take2)
- **Converted to**: Faster-Whisper (for faster ASR with minimal performance loss)
- **Language**: Hebrew (`he`)
- **Quantization**: Float32

All credits go to **ivrit-ai** for developing the original Whisper model.

## How to Use the Model

To use the model in your projects, follow the steps below to load and transcribe audio:

```python
import json

# Import the Faster-Whisper module
import faster_whisper

# Load the model from Hugging Face
model = faster_whisper.WhisperModel("israelisraeli/ivrit-faster-whisper-turbo-d4", device="cuda")

# Transcribe the audio file (segments are yielded lazily)
segs, _ = model.transcribe("AUDIOFILE_efiTheTigger.mp3", language="he")

# Collect the segments into a list of dictionaries with timestamps and text
transcribed_segments_with_timestamps = [
    {"start": s.start, "end": s.end, "text": s.text} for s in segs
]

# Save the result to a JSON file
with open("transcribed_segments_with_timestamps.json", "w", encoding="utf-8") as json_file:
    json.dump(
        transcribed_segments_with_timestamps, json_file, ensure_ascii=False, indent=4
    )

print("Transcription saved to transcribed_segments_with_timestamps.json")
```
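Beyond JSON, the timestamped segments can also be rendered as SRT subtitles. A minimal sketch, working on the segment dictionaries produced above (the `srt_timestamp` and `segments_to_srt` helpers are illustrative additions, not part of faster-whisper):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render a list of {"start", "end", "text"} dicts as an SRT document."""
    blocks = [
        f"{i}\n"
        f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
        f"{seg['text'].strip()}\n"
        for i, seg in enumerate(segments, start=1)
    ]
    return "\n".join(blocks)
```

For example, `segments_to_srt(transcribed_segments_with_timestamps)` returns a string that can be written straight to a `.srt` file.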

## Conversion process

### Tokenizer Conversion

```python
from transformers import AutoTokenizer

# Load the tokenizer from the original Whisper model files
tokenizer_directory = "path_to_whisper_model_files"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_directory)

# Save the tokenizer into a single tokenizer.json file
tokenizer.save_pretrained("path_to_save_directory", legacy_format=False)
```

### Model Conversion to Faster-Whisper

To convert the original [ivrit-ai/whisper-large-v3-turbo-d4-p1-take2](https://huggingface.co/ivrit-ai/whisper-large-v3-turbo-d4-p1-take2) model to the Faster-Whisper format, I used the CTranslate2 library. The following command was used for the conversion:

```bash
ct2-transformers-converter \
  --model ./whisper-large-v3-turbo-d4-p1-take2 \
  --output_dir ./ivrit-faster-whisper-turbo-d4 \
  --copy_files tokenizer.json preprocessor_config.json
```
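The converted directory can be sanity-checked before loading it with faster-whisper. A small sketch, assuming the CTranslate2 converter writes `model.bin` and `config.json` alongside the files passed to `--copy_files` (the `missing_files` helper is illustrative, not part of either library):

```python
from pathlib import Path

# Files expected in the converted model directory: CTranslate2 output
# (assumed to be model.bin and config.json) plus the two copied files.
EXPECTED = ["model.bin", "config.json", "tokenizer.json", "preprocessor_config.json"]


def missing_files(model_dir: str) -> list[str]:
    """Return the expected files that are absent from the converted directory."""
    d = Path(model_dir)
    return [name for name in EXPECTED if not (d / name).exists()]
```

For example, `missing_files("./ivrit-faster-whisper-turbo-d4")` should return an empty list after a successful conversion.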