Automatic Speech Recognition
audio
ericchin committed on
Commit
e112c07
1 Parent(s): 060a009

Update README.md

Files changed (1)
  1. README.md +10 -10
README.md CHANGED
@@ -123,13 +123,13 @@ Whisper is a Transformer based encoder-decoder model, also referred to as a sequ
 
 | Model Type | Parameters | n_audio_ctx | n_audio_state | n_audio_head | n_audio_layer | n_text_ctx | n_text_state | n_text_head | n_text_layer | n_mels | n_vocab |
 |---------------------------|------------|-------------|---------------|--------------|---------------|------------|--------------|-------------|--------------|--------|---------|
-| whisper-tiny | 39 M | 1500 | 384 | 6 | 4 | 224 | 384 | 6 | 4 | 80 | 51864 |
-| whisper-base | 74 M | 1500 | 512 | 8 | 6 | 224 | 512 | 8 | 6 | 80 | 51864 |
-| **whisper-small** | 244 M | 1500 | 768 | 12 | 12 | 224 | 768 | 12 | 12 | 80 | 51864 |
-| whisper-medium | 769 M | 1500 | 1024 | 16 | 24 | 224 | 1024 | 16 | 16 | 80 | 51864 |
-| whisper-large-v1 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51864 |
-| whisper-large-v2 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51864 |
-| distil-whisper-large-v2 | 756 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 2 | 80 | 51864 |
-| whisper-large-v3 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 128 | 51865 |
-| distil-whisper-large-v3 | 756 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 2 | 128 | 51865 |
-| whisper-large-v3-turbo | 809 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 4 | 128 | 51865 |
+| whisper-tiny | 39 M | 1500 | 384 | 6 | 4 | 224 | 384 | 6 | 4 | 80 | 51865 |
+| whisper-base | 74 M | 1500 | 512 | 8 | 6 | 224 | 512 | 8 | 6 | 80 | 51865 |
+| **whisper-small** | 244 M | 1500 | 768 | 12 | 12 | 224 | 768 | 12 | 12 | 80 | 51865 |
+| whisper-medium | 769 M | 1500 | 1024 | 16 | 24 | 224 | 1024 | 16 | 16 | 80 | 51865 |
+| whisper-large-v1 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51865 |
+| whisper-large-v2 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 80 | 51865 |
+| distil-whisper-large-v2 | 756 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 2 | 80 | 51865 |
+| whisper-large-v3 | 1550 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 20 | 128 | 51866 |
+| distil-whisper-large-v3 | 756 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 2 | 128 | 51866 |
+| whisper-large-v3-turbo | 809 M | 1500 | 1280 | 20 | 32 | 224 | 1280 | 20 | 4 | 128 | 51866 |
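
Reviewer note: the only column this hunk touches is `n_vocab`, and every row appears to move up by exactly one. A minimal sketch for checking that is below. The values are copied by hand from the removed and added rows of this diff; the names `ROWS`, `vocab_deltas`, and `vocab_by_mels` are hypothetical helpers and not part of the repository.

```python
# (model, n_mels, old n_vocab, new n_vocab), transcribed from the diff rows.
ROWS = [
    ("whisper-tiny", 80, 51864, 51865),
    ("whisper-base", 80, 51864, 51865),
    ("whisper-small", 80, 51864, 51865),
    ("whisper-medium", 80, 51864, 51865),
    ("whisper-large-v1", 80, 51864, 51865),
    ("whisper-large-v2", 80, 51864, 51865),
    ("distil-whisper-large-v2", 80, 51864, 51865),
    ("whisper-large-v3", 128, 51865, 51866),
    ("distil-whisper-large-v3", 128, 51865, 51866),
    ("whisper-large-v3-turbo", 128, 51865, 51866),
]


def vocab_deltas(rows):
    """Map each model name to (new n_vocab - old n_vocab)."""
    return {model: new - old for model, _, old, new in rows}


def vocab_by_mels(rows):
    """Group the updated n_vocab values by the row's n_mels value."""
    grouped = {}
    for _, n_mels, _, new in rows:
        grouped.setdefault(n_mels, set()).add(new)
    return grouped


if __name__ == "__main__":
    for model, delta in vocab_deltas(ROWS).items():
        print(f"{model}: n_vocab changed by {delta:+d}")
    print(vocab_by_mels(ROWS))
```

Grouping by `n_mels` is only a convenience for reading the table: in the updated rows, the 80-mel models all share one `n_vocab` value and the 128-mel models share another.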