r-f committed on
Commit 129d357 (1 parent: 574f7d9)

Update README.md

  1. README.md +8 -8
README.md CHANGED
@@ -5,7 +5,7 @@ tags:
  metrics:
  - accuracy
  model_index:
- name: wav2vec2-lg-xlsr-en-speech-emotion-recognition
  ---
  # Speech Emotion Recognition By Fine-Tuning Wav2Vec 2.0
  The model is a fine-tuned version of [jonatasgrosman/wav2vec2-large-xlsr-53-english](https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-english) for a Speech Emotion Recognition (SER) task.
@@ -20,13 +20,13 @@ Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) (https://ze
  Toronto emotional speech set (TESS) (https://tspace.library.utoronto.ca/handle/1807/24487)
  - 2800 audio files from 2 female actors
 
- 7 classifcation labels
  ```python
  emotions = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
  ```
  It achieves the following results on the evaluation set:
- - Loss: 0.5023
- - Accuracy: 0.8223
  ## Model description
  More information needed
  ## Intended uses & limitations
@@ -39,13 +39,13 @@ The following hyperparameters were used during training:
  - learning_rate: 0.0001
  - train_batch_size: 4
  - eval_batch_size: 4
  - seed: 42
  - gradient_accumulation_steps: 2
- - total_train_batch_size: 8
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 3
- - mixed_precision_training: Native AMP
 
  ### Training results
  | Step | Training Loss | Validation Loss | Accuracy |
 
  metrics:
  - accuracy
  model_index:
+ name: wav2vec-english-speech-emotion-recognition
  ---
  # Speech Emotion Recognition By Fine-Tuning Wav2Vec 2.0
  The model is a fine-tuned version of [jonatasgrosman/wav2vec2-large-xlsr-53-english](https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-english) for a Speech Emotion Recognition (SER) task.
 
  Toronto emotional speech set (TESS) (https://tspace.library.utoronto.ca/handle/1807/24487)
  - 2800 audio files from 2 female actors
 
+ 7 labels/emotions were used as classification labels
  ```python
  emotions = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
  ```
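For reference, a minimal sketch of how the seven labels from the snippet above could be turned into index mappings for a classification head. The names `id2label` and `label2id` are illustrative assumptions and do not come from the README. The sketch also shows why the comma separators matter: Python joins adjacent string literals, so a list written without commas holds one string rather than seven labels.

```python
# The seven emotion labels from the README, written here with comma separators.
emotions = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

# Hypothetical index mappings (names are assumptions, not from the README).
id2label = dict(enumerate(emotions))
label2id = {label: idx for idx, label in id2label.items()}

# Adjacent string literals without commas are concatenated by Python,
# so this list contains a single element.
joined = ['angry' 'disgust' 'fear' 'happy' 'neutral' 'sad' 'surprise']

print(len(emotions), len(joined))
```

Checking `len(emotions)` against the expected number of classes is a cheap guard before building a label mapping.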
  It achieves the following results on the evaluation set:
+ - Loss: 0.104075
+ - Accuracy: 0.97463
  ## Model description
  More information needed
  ## Intended uses & limitations
 
  - learning_rate: 0.0001
  - train_batch_size: 4
  - eval_batch_size: 4
+ - eval_steps: 500
  - seed: 42
  - gradient_accumulation_steps: 2
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - num_epochs: 4
+ - max_steps: 7500
+ - save_steps: 1500
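A rough sketch of what the hyperparameters in this hunk imply, assuming single-device training and step-based evaluation and checkpointing (neither is stated in the README). In the Hugging Face `Trainer`, a positive `max_steps` overrides the epoch count, so the totals below are derived from `max_steps` rather than `num_epochs`. Variable names mirror the README entries; `num_devices` is an assumption.

```python
# Values taken from the hyperparameter list in the README.
train_batch_size = 4
gradient_accumulation_steps = 2
max_steps = 7500
eval_steps = 500
save_steps = 1500

# Assumption: training ran on a single device.
num_devices = 1

# Samples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Approximate number of training samples processed over the whole run.
samples_seen = effective_batch_size * max_steps

# How many evaluations and checkpoints a step-based schedule would produce.
num_evaluations = max_steps // eval_steps
num_checkpoints = max_steps // save_steps

print(effective_batch_size, samples_seen, num_evaluations, num_checkpoints)
```

Under these assumptions the effective batch size is 8, which agrees with the `total_train_batch_size: 8` entry that this commit removes.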
 
  ### Training results
  | Step | Training Loss | Validation Loss | Accuracy |