benderrodriguez committed on
Commit
72a4fb3
1 Parent(s): fecf5c8

Update README.md

  1. README.md +37 -3
README.md CHANGED
@@ -1,3 +1,37 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - he
+ - en
+ ---
+
+ Background
+ -
+
+ This ASR model was trained on a private dataset containing approximately 310 hours of high-quality Hebrew data.
+ Data was transcribed using professional transcription services.
+
+ Model name decoding:
+
+ \<model name\>-\<size\>-\<dataset\>-\<epoch\>
+
+ This specific model is a faster-whisper model of size large-v2 (v2), trained on version 1 of our private dataset (pd1), and saved after one epoch (e1).
+
+
+
+ Running the model
+ -
+
+ ```
+ # Initialize the model
+ import faster_whisper
+ model = faster_whisper.WhisperModel('ivrit-ai/faster-whisper-v2-pd1-e1')
+
+ # Transcribe a media file (mp3_file is the path to your audio file)
+ segs, _ = model.transcribe(mp3_file, language='he')
+ for seg in segs:
+     print(seg.text)
+ ```
+
+ Each segment object also carries additional data, such as start and end timestamps.
+ Feel free to explore it.
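
As a rough illustration of the timestamps mentioned in the README, the sketch below formats a segment's `start`, `end`, and `text` fields into one line. The `format_segment` helper is hypothetical (it is not part of this commit or of faster-whisper), and a `SimpleNamespace` object stands in for a real segment so the snippet runs without downloading the model.

```python
from types import SimpleNamespace


def format_segment(seg):
    """Render one segment as '[start -> end] text', with times in seconds."""
    return f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text.strip()}"


# Stand-in for a segment yielded by model.transcribe(); real segments
# expose the same start, end, and text attributes.
demo_segment = SimpleNamespace(start=0.0, end=2.5, text=" shalom")
print(format_segment(demo_segment))
```

With a real model, the same helper could presumably be applied inside the `for seg in segs:` loop in place of `print(seg.text)`.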