ayoubkirouane committed on
Commit
5ade76e
1 Parent(s): 99672b0

Update README.md

Files changed (1)
  1. README.md +28 -30
README.md CHANGED
@@ -3,55 +3,53 @@ language:
  - ar
  license: apache-2.0
  base_model: openai/whisper-small
- tags:
- - generated_from_trainer
  datasets:
  - mozilla-foundation/common_voice_11_0
  model-index:
- - name: Arabic-Whisper Small
  results: []
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # Arabic-Whisper Small
 
- This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the mozilla-foundation/common_voice_11_0 dataset.
 
- ## Model description
 
- More information needed
 
- ## Intended uses & limitations
 
- More information needed
 
- ## Training and evaluation data
 
- More information needed
 
- ## Training procedure
 
- ### Training hyperparameters
 
- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 16
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 250
- - training_steps: 500
 
- ### Training results
 
 
 
- ### Framework versions
 
- - Transformers 4.33.2
- - Pytorch 2.0.1+cu118
- - Datasets 2.14.5
- - Tokenizers 0.13.3
 
  - ar
  license: apache-2.0
  base_model: openai/whisper-small
  datasets:
  - mozilla-foundation/common_voice_11_0
  model-index:
+ - name: Whisper-small-ar
  results: []
+ library_name: transformers
+ pipeline_tag: automatic-speech-recognition
  ---
 
  # Arabic-Whisper Small
 
+ ## Description
+
+ Whisper-small-ar is an Automatic Speech Recognition (ASR) model fine-tuned specifically for the Arabic language using the Whisper model architecture. ASR models are designed to convert spoken language into written text. This model has been fine-tuned on the Mozilla Common Voice dataset (version 11.0) to transcribe spoken Arabic speech into textual form.
+
+ ### Key Features
 
+ - **Arabic Language Support:** Whisper-small-ar is optimized for recognizing and transcribing the Arabic language accurately. It can handle various Arabic dialects and accents.
 
+ - **Transformer Architecture:** The model is built on a powerful Transformer-based encoder-decoder architecture, which has demonstrated state-of-the-art performance in various natural language processing tasks, including ASR.
 
+ - **Fine-tuned for Arabic ASR:** The model has undergone a fine-tuning process on a substantial amount of Arabic speech data, making it well-suited for a wide range of ASR applications in Arabic, such as transcription of podcasts, call center recordings, and more.
 
+ - **Open-Source:** Whisper-small-ar is open-source and available for use by the research and developer community, facilitating the advancement of ASR technology for the Arabic language.
 
+ - **Compatible with Hugging Face Transformers:** You can easily integrate and utilize this model in your ASR projects using the Hugging Face Transformers library.
 
+ ### Use Cases
 
+ Whisper-small-ar can be employed in a variety of ASR use cases, including:
 
+ - **Transcription Services:** Convert spoken Arabic content, such as audio recordings, podcasts, or videos, into written text for indexing, search, or translation purposes.
 
+ - **Voice Assistants:** Enhance voice-activated systems and virtual assistants with accurate Arabic speech recognition capabilities.
 
+ - **Language Processing Applications:** Integrate the model into applications involving Arabic language processing, such as sentiment analysis, keyword extraction, and more.
 
+ - **Multilingual ASR:** Combine Whisper-small-ar with other multilingual ASR models for applications requiring recognition of multiple languages.
 
+ ## Usage
+ ```python
+ # Use a pipeline as a high-level helper
+ from transformers import pipeline
 
+ pipe = pipeline("automatic-speech-recognition", model="ayoubkirouane/whisper-small-ar")
 
+ def transcribe(audio):
+     text = pipe(audio)["text"]
+     return text
+ ```
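
The Usage snippet added by this commit defines `transcribe` but never calls it, and it builds the pipeline at import time. The sketch below is one possible way to exercise it, not part of the commit: `load_pipeline`, the optional `asr` parameter, and the `chunk_length_s=30` setting for long recordings are assumptions introduced here for illustration, and `sample.wav` is a hypothetical file path.

```python
from functools import lru_cache

# Model id taken from the Usage section of the README in this commit.
MODEL_ID = "ayoubkirouane/whisper-small-ar"


@lru_cache(maxsize=1)
def load_pipeline():
    """Build the ASR pipeline once and reuse it on later calls."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # assumption: split long audio into 30 s chunks
    )


def transcribe(audio, asr=None):
    """Return the transcription of `audio` as a stripped string.

    `audio` can be anything the pipeline accepts, such as a file path.
    `asr` is a hypothetical hook: any callable returning {"text": ...},
    which makes the function easy to test without downloading the model.
    """
    if asr is None:
        asr = load_pipeline()
    return asr(audio)["text"].strip()
```

Calling `transcribe("sample.wav")` would download the model on first use; decoding audio from a file path also requires `ffmpeg` to be installed on the system.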