Update README.md
README.md
CHANGED
@@ -38,17 +38,48 @@ It achieves the following results on the evaluation set:
 - Loss: 0.3123
 - Wer: 18.9229
 
-## Model description
-
-More information needed
-
 ## Intended uses & limitations
 
-
-
-
-
-
+This model can be used in various application areas, including
+
+- Transcription of Turkish language
+- Voice commands
+- Automatic subtitling for Turkish videos
+
+## How To Use
+
+```python
+import time
+import torch
+from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
+
+device = "cuda:0" if torch.cuda.is_available() else "cpu"
+torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
+
+model_id = "selimc/whisper-large-v3-turbo-turkish"
+
+model = AutoModelForSpeechSeq2Seq.from_pretrained(
+    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
+)
+model.to(device)
+
+processor = AutoProcessor.from_pretrained(model_id)
+
+pipe = pipeline(
+    "automatic-speech-recognition",
+    model=model,
+    tokenizer=processor.tokenizer,
+    feature_extractor=processor.feature_extractor,
+    chunk_length_s=30,
+    batch_size=16,
+    return_timestamps=True,
+    torch_dtype=torch_dtype,
+    device=device,
+)
+
+result = pipe("test.mp3")
+print(result["text"])
+```
 
 ## Training procedure
 
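
Note on the added snippet: it passes `return_timestamps=True` but only prints `result["text"]`, while the new "Intended uses" list mentions automatic subtitling. The following is a minimal sketch of how the timestamped output might be turned into SRT subtitles. It assumes the result also carries a `chunks` list whose entries look like `{"timestamp": (start, end), "text": ...}`, which is the shape the transformers ASR pipeline returns when timestamps are enabled. The helper names `format_timestamp` and `chunks_to_srt`, the `fallback_duration` parameter, and the sample data are hypothetical and not part of the README or the library.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks, fallback_duration=2.0):
    """Build SRT text from pipeline-style chunks.

    Assumption: each chunk is a dict with a "timestamp" (start, end) tuple
    in seconds and a "text" string. The end time of a chunk can be None,
    so a fallback duration (an arbitrary choice) is used in that case.
    """
    entries = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            end = start + fallback_duration
        text = chunk["text"].strip()
        entries.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive SRT entries.
    return "\n".join(entries)


# Hand-written stand-in for result["chunks"]; this is not real model output.
sample_chunks = [
    {"timestamp": (0.0, 2.5), "text": " Merhaba dünya."},
    {"timestamp": (2.5, None), "text": " Nasılsınız?"},
]
srt_text = chunks_to_srt(sample_chunks)
print(srt_text)
```

With the README's pipeline, this would presumably be called as `chunks_to_srt(result["chunks"])` and the returned string written to a `.srt` file.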