totetecdev committed
Commit 28cfae8
1 Parent(s): 7141b4e

Update README.md

Files changed (1):
  1. README.md +79 -21

README.md CHANGED
@@ -1,31 +1,89 @@
- ```
- from peft import PeftModel, PeftConfig
- from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline

- # Define the PEFT model path
- peft_model_id = "totetecdev/whisper-large-v2-uzbek-100steps"

- # Load the PEFT configuration
- peft_config = PeftConfig.from_pretrained(peft_model_id)

- # Load the base model
- model = WhisperForConditionalGeneration.from_pretrained(
-     peft_config.base_model_name_or_path,
-     load_in_8bit=True,
-     device_map="auto"
  )

- # Load the PEFT model
- model = PeftModel.from_pretrained(model, peft_model_id)

- # Enable the cache
- model.config.use_cache = True

- # Load the Whisper processor (instead of the tokenizer)
- processor = WhisperProcessor.from_pretrained(peft_config.base_model_name_or_path)

- # Create the pipeline
- pipe = pipeline("automatic-speech-recognition", model=model, tokenizer=processor.tokenizer, feature_extractor=processor.feature_extractor)
- ```

+ # Whisper Large v2 Uzbek Speech Recognition Model
+
+ This project contains a fine-tuned version of the Whisper Large v2 model for Uzbek speech recognition. The model can be used to transcribe Uzbek audio files into text.
+
+ ## Installation
+
+ 1. Ensure you have Python 3.7 or higher installed.
+
+ 2. Install the required libraries:
+
+ ```
+ pip install transformers datasets accelerate soundfile librosa torch
+ ```
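+
+ 3. (Optional) Check that the libraries import correctly and whether a GPU is visible. This is just a minimal sanity-check sketch:
+
+ ```python
+ import torch
+ import transformers
+
+ # Print the installed transformers version and whether CUDA is available
+ print("transformers version:", transformers.__version__)
+ print("CUDA available:", torch.cuda.is_available())
+ ```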

+ ## Usage
+
+ You can use the model with the following Python code:
+
+ ```python
+ from transformers import pipeline, WhisperForConditionalGeneration, WhisperProcessor
+ import torch
+
+ # Load the model and processor
+ model_name = "totetecdev/whisper-large-v2-uzbek-100steps"
+ model = WhisperForConditionalGeneration.from_pretrained(model_name)
+ processor = WhisperProcessor.from_pretrained(model_name)
+
+ # Create the speech recognition pipeline
+ pipe = pipeline(
+     "automatic-speech-recognition",
+     model=model,
+     tokenizer=processor.tokenizer,
+     feature_extractor=processor.feature_extractor,
+     torch_dtype=torch.float16,
+     device_map="auto",
  )

+ # Transcribe an audio file
+ audio_file = "path/to/your/audio/file.wav"  # Replace with the path to your audio file
+ result = pipe(audio_file)
+
+ print(result["text"])
+ ```
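+
+ If the output ever comes back in the wrong language, you can pass generation hints through the pipeline's `generate_kwargs`. This is a minimal sketch; `"uz"` is Whisper's language code for Uzbek:
+
+ ```python
+ # Ask Whisper explicitly for Uzbek transcription instead of relying on language detection
+ result = pipe(audio_file, generate_kwargs={"language": "uz", "task": "transcribe"})
+ print(result["text"])
+ ```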
+
+ ## Example Usage
+
+ 1. Prepare your audio file (it should be in WAV format).
+ 2. Save the above code in a Python file (e.g., `transcribe.py`).
+ 3. Update the `audio_file` variable in the code with the path to your audio file (and `model_name` if you want to use a different checkpoint).
+ 4. Run the following command in your terminal or command prompt:
+
+ ```
+ python transcribe.py
+ ```
+
+ 5. The transcribed text will be displayed on the screen.
+
+ ## Notes
+
+ - This model will perform best with Uzbek audio files.
+ - Longer audio files may require more processing time (see the chunking sketch after this list).
+ - GPU usage is recommended, but the model can also run on CPU.
+ - If you're using Google Colab, you can upload your audio file using:
+
+ ```python
+ from google.colab import files
+ uploaded = files.upload()
+ audio_file = next(iter(uploaded))
+ ```
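+
+ For long recordings, the pipeline can also split the audio into chunks and batch them, which keeps memory use manageable and speeds things up on a GPU. This is a minimal sketch using the pipeline's built-in `chunk_length_s` and `batch_size` options; the 30-second chunks and batch size of 8 are illustrative values, not tuned settings:
+
+ ```python
+ # Chunked transcription for long audio files
+ result = pipe(
+     audio_file,
+     chunk_length_s=30,  # split the recording into 30-second chunks
+     batch_size=8,       # transcribe several chunks per forward pass
+ )
+ print(result["text"])
+ ```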
+
+ ## Model Details
+
+ - Base Model: Whisper Large v2
+ - Fine-tuned for: Uzbek Speech Recognition
+
+ ## License
+
+ This project is licensed under [LICENSE]. See the LICENSE file for details.
+
+ ## Contact
+
+ For questions or feedback, please contact [KHABIB SALIMOV] at [totete.dev@gmail.com].
+
+ ## Acknowledgements
+
+ - OpenAI for the original Whisper model