wavlm_finetuned_emodb

This model is a fine-tuned version of microsoft/wavlm-base. The card metadata does not record the training dataset (it is listed as None), although the model name suggests EmoDB. It achieves the following results on the evaluation set:

Loss: 0.9254
Uar (unweighted average recall): 0.8148
Acc (accuracy): 0.8529
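
The two metrics above can diverge when classes are imbalanced. The following is a minimal sketch, assuming Uar denotes unweighted average recall (its usual meaning in speech emotion recognition) and Acc denotes plain accuracy. The toy labels and function names are illustrative only and are not taken from the evaluation set or from the original training code.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    total = defaultdict(int)  # samples per true class
    hits = defaultdict(int)   # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced example (hypothetical data): 6 anger clips, 2 neutral clips.
y_true = ["anger"] * 6 + ["neutral"] * 2
y_pred = ["anger"] * 6 + ["anger", "neutral"]

toy_acc = accuracy(y_true, y_pred)                   # 7 of 8 correct
toy_uar = unweighted_average_recall(y_true, y_pred)  # mean of 1.0 and 0.5
```

In this toy case accuracy is higher than UAR because the majority class is predicted perfectly while the minority class is only half right.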

Model description

This model classifies an input audio waveform into one of four common emotion categories: anger, happiness, sadness, and neutral.
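
A hedged sketch of how such a classifier might be called through the Transformers `audio-classification` pipeline. The repository ID `Bagus/wavlm_finetuned_emodb` is an assumption based on the model name and author shown on the page. The label strings returned by the checkpoint are also an assumption (they may be generic names such as `LABEL_0`), and decoding a file path typically requires ffmpeg to be installed. The helper names are hypothetical.

```python
def top_emotion(predictions):
    """Return the label with the highest score.

    `predictions` is a list of {"label": str, "score": float} dicts,
    which is the shape the audio-classification pipeline returns for
    a single input.
    """
    return max(predictions, key=lambda p: p["score"])["label"]


def classify_file(path, model_id="Bagus/wavlm_finetuned_emodb"):
    """Classify one audio file. The default model_id is an assumed repo ID."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    return top_emotion(classifier(path))


# Offline illustration with made-up scores (not real model output).
example_scores = [
    {"label": "anger", "score": 0.70},
    {"label": "happiness", "score": 0.15},
    {"label": "sadness", "score": 0.05},
    {"label": "neutral", "score": 0.10},
]
example_label = top_emotion(example_scores)
```

Calling `classify_file("clip.wav")` would download the checkpoint on first use, and `clip.wav` is a placeholder path.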

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
mixed_precision_training: Native AMP
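
The relationship between these values can be checked with a little arithmetic. The sketch below assumes a single training device and no warmup (neither is listed above), and takes the total of 60 optimizer steps from the last row of the training results table. The `linear_lr` function is an illustrative reimplementation of a linear decay schedule, not the original training code.

```python
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 4
num_devices = 1    # assumption: device count is not stated in the card
total_steps = 60   # last step reported in the training results table
warmup_steps = 0   # assumption: no warmup is listed

# Samples contributing to each optimizer update.
effective_batch = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=warmup_steps):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


lr_start = linear_lr(0)     # full learning rate at the first step
lr_middle = linear_lr(30)   # half the base rate at the midpoint
lr_end = linear_lr(60)      # decays to zero at the final step
```

Under these assumptions, `effective_batch` matches the listed total_train_batch_size of 32.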

Training results

Training Loss	Epoch	Step	Validation Loss	Uar	Acc
1.3857	0.1538	1	1.3786	0.25	0.1985
1.3322	0.3077	2	1.3549	0.2914	0.2426
1.3112	0.4615	3	1.3165	0.5375	0.6103
1.2981	0.6154	4	1.2905	0.5	0.6029
1.1317	0.7692	5	1.2923	0.4907	0.5956
1.2078	0.9231	6	1.2619	0.5556	0.6471
0.9237	1.0769	7	1.2254	0.5741	0.6618
0.8396	1.2308	8	1.2247	0.5556	0.6471
1.0354	1.3846	9	1.2076	0.5556	0.6471
0.9205	1.5385	10	1.1891	0.5833	0.6691
0.9071	1.6923	11	1.1704	0.6481	0.7206
0.8132	1.8462	12	1.1988	0.6939	0.5735
0.8994	2.0	13	1.1960	0.6574	0.5221
0.7924	2.1538	14	1.1579	0.6658	0.5662
0.7386	2.3077	15	1.1401	0.6944	0.7574
0.6324	2.4615	16	1.1202	0.6111	0.6912
0.7282	2.6154	17	1.1090	0.5833	0.6691
0.673	2.7692	18	1.0907	0.6111	0.6912
0.623	2.9231	19	1.0578	0.7872	0.8235
0.4954	3.0769	20	1.0357	0.8475	0.8676
0.5201	3.2308	21	1.0365	0.7778	0.8235
0.5608	3.3846	22	1.0346	0.75	0.8015
0.6334	3.5385	23	1.0047	0.7685	0.8162
0.3737	3.6923	24	0.9585	0.8658	0.8897
0.5369	3.8462	25	0.9527	0.9178	0.8824
0.3599	4.0	26	0.9682	0.8906	0.8382
0.7642	4.1538	27	0.9418	0.8951	0.8456
0.4882	4.3077	28	0.9095	0.9310	0.9265
0.5011	4.4615	29	0.9378	0.8426	0.875
0.3707	4.6154	30	0.9630	0.7963	0.8382
0.381	4.7692	31	0.9721	0.7870	0.8309
0.2307	4.9231	32	0.9522	0.7963	0.8382
0.2829	5.0769	33	0.9598	0.7870	0.8309
0.2581	5.2308	34	0.9458	0.8056	0.8456
0.4658	5.3846	35	0.9442	0.8148	0.8529
0.2133	5.5385	36	0.9524	0.7870	0.8309
0.1107	5.6923	37	0.9601	0.7870	0.8309
0.3599	5.8462	38	0.9605	0.7778	0.8235
0.3085	6.0	39	0.9522	0.7918	0.8309
0.2739	6.1538	40	0.9564	0.7870	0.8309
0.3279	6.3077	41	0.9582	0.7870	0.8309
0.1346	6.4615	42	0.9646	0.7685	0.8162
0.1429	6.6154	43	0.9695	0.7685	0.8162
0.1	6.7692	44	0.9692	0.7685	0.8162
0.1852	6.9231	45	0.9651	0.7685	0.8162
0.1028	7.0769	46	0.9378	0.8056	0.8456
0.2071	7.2308	47	0.9154	0.8195	0.8529
0.1752	7.3846	48	0.8882	0.8566	0.8824
0.0907	7.5385	49	0.8704	0.8843	0.9044
0.1263	7.6923	50	0.8719	0.8798	0.8971
0.068	7.8462	51	0.8738	0.8798	0.8971
0.0589	8.0	52	0.8881	0.8566	0.8824
0.1494	8.1538	53	0.9001	0.8473	0.875
0.1137	8.3077	54	0.9120	0.8288	0.8603
0.0522	8.4615	55	0.9212	0.8148	0.8529
0.0666	8.6154	56	0.9251	0.8148	0.8529
0.0867	8.7692	57	0.9270	0.8148	0.8529
0.0764	8.9231	58	0.9264	0.8148	0.8529
0.0526	9.0769	59	0.9259	0.8148	0.8529
0.2877	9.2308	60	0.9254	0.8148	0.8529
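
The log shows that the final step is not the best one on either validation loss or Uar. As a small illustration, the sketch below selects the best row from an excerpt of the table under two different criteria. The rows are copied from the table above. The tuple layout and variable names are illustrative, and nothing here implies that intermediate checkpoints were saved or published.

```python
# (step, validation_loss, uar, acc): excerpt of rows from the table above.
rows = [
    (1, 1.3786, 0.25, 0.1985),
    (20, 1.0357, 0.8475, 0.8676),
    (28, 0.9095, 0.9310, 0.9265),
    (49, 0.8704, 0.8843, 0.9044),
    (60, 0.9254, 0.8148, 0.8529),
]

# Highest Uar and lowest validation loss can point at different steps.
best_by_uar = max(rows, key=lambda r: r[2])
best_by_loss = min(rows, key=lambda r: r[1])
final_row = rows[-1]

# Gap between the peak Uar in the excerpt and the final reported Uar.
uar_gap = best_by_uar[2] - final_row[2]
```

In this excerpt the highest Uar occurs at step 28 and the lowest validation loss at step 49, while the reported evaluation results correspond to step 60.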

Framework versions

Transformers 4.40.1
Pytorch 2.3.0+cu121
Datasets 2.19.0
Tokenizers 0.19.1
