---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- accuracy
model_index:
  name: wav2vec2-lg-xlsr-en-speech-emotion-recognition
---

# Speech Emotion Recognition By Fine-Tuning Wav2Vec 2.0

The model is a fine-tuned version of [jonatasgrosman/wav2vec2-large-xlsr-53-english](https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-english) for a Speech Emotion Recognition (SER) task.

Several datasets were used to fine-tune the original model:

[Surrey Audio-Visual Expressed Emotion (SAVEE)](http://kahlan.eps.surrey.ac.uk/savee/Database.html)
- 480 audio files from 4 male actors

[Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)](https://zenodo.org/record/1188976#.YO6yI-gzaUk)
- 1440 audio files from 24 professional actors (12 female, 12 male)

[Toronto emotional speech set (TESS)](https://tspace.library.utoronto.ca/handle/1807/24487)
- 2800 audio files from 2 female actors

The model predicts 7 classification labels:
```python
emotions = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
```
It achieves the following results on the evaluation set:
- Loss: 0.5023
- Accuracy: 0.8223
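
As a quick usage sketch, the model can be called through the `transformers` audio-classification pipeline. The repository id and the audio file path below are placeholders, not values given by this card:

```python
# Minimal inference sketch. Assumptions: the model is published on the Hub
# under the repo id below, and `speech.wav` is a local 16 kHz mono recording.
from transformers import pipeline

classifier = pipeline(
    "audio-classification",
    model="wav2vec2-lg-xlsr-en-speech-emotion-recognition",  # replace with the full <user>/<model> id
)

predictions = classifier("speech.wav")
print(predictions)  # e.g. [{'label': 'happy', 'score': 0.87}, ...]
```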

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
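
For orientation, here is a sketch of how these settings map onto `transformers.TrainingArguments`. The output directory name is illustrative, and the dataset preparation and `Trainer` call are omitted because the original training script is not part of this card:

```python
# Sketch only: maps the hyperparameters listed above onto TrainingArguments.
# Adam betas (0.9, 0.999) and epsilon 1e-08 are the library defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-lg-xlsr-en-speech-emotion-recognition",  # illustrative name
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,  # effective train batch size of 8
    num_train_epochs=3,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,  # native AMP mixed-precision training
)
```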

### Training results

| Step | Training Loss | Validation Loss | Accuracy |
|-----:|--------------:|----------------:|---------:|
|  500 | 1.812400      | 1.365212        | 0.486258 |
| 1000 | 0.887200      | 0.773145        | 0.797040 |
| 1500 | 0.703500      | 0.574954        | 0.852008 |
| 2000 | 0.687900      | 1.286738        | 0.775899 |
| 2500 | 0.649800      | 0.697455        | 0.832981 |
| 3000 | 0.569600      | 0.337240        | 0.892178 |
| 3500 | 0.421800      | 0.307072        | 0.911205 |
| 4000 | 0.308800      | 0.374443        | 0.930233 |
| 4500 | 0.268800      | 0.260444        | 0.936575 |
| 5000 | 0.297300      | 0.302985        | 0.923890 |
| 5500 | 0.176500      | 0.165439        | 0.961945 |
| 6000 | 0.147500      | 0.170199        | 0.961945 |
| 6500 | 0.127400      | 0.155310        | 0.966173 |
| 7000 | 0.069900      | 0.103882        | 0.976744 |
| 7500 | 0.083000      | 0.104075        | 0.974630 |
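
The card does not include the metric function used during training; the following is a plausible sketch of how the accuracy values above could be computed with the `evaluate` library when passed to a `Trainer` as `compute_metrics`:

```python
# Hypothetical compute_metrics function for the accuracy column above;
# the actual training script is not included in this card.
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```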