metadata

license: apache-2.0
base_model: google/vit-base-patch16-224-in21k
tags:
  - generated_from_trainer
datasets:
  - FastJobs/Visual_Emotional_Analysis
metrics:
  - accuracy
  - precision
  - f1
model-index:
  - name: emotion_classification
    results:
      - task:
          name: Image Classification
          type: image-classification
        dataset:
          name: FastJobs/Visual_Emotional_Analysis
          type: FastJobs/Visual_Emotional_Analysis
          config: FastJobs--Visual_Emotional_Analysis
          split: train
          args: FastJobs--Visual_Emotional_Analysis
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.675
          - name: Precision
            type: precision
            value: 0.6854354001733034
          - name: F1
            type: f1
            value: 0.6750572520063745

Emotion Classification

This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the FastJobs/Visual_Emotional_Analysis dataset.

In theory, with 8 emotion classes, the accuracy of a uniform random guess on this dataset is 1/8 = 0.125.
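As a quick sanity check on the chance level, the sketch below computes the expected accuracy and cross-entropy loss of a uniform guess over K classes. The class count of 8 is an assumption inferred from the first logged training loss (about 2.08, close to ln 8); the function name `chance_baseline` is hypothetical.

```python
import math

def chance_baseline(num_classes):
    """Return (expected accuracy, cross-entropy loss) of a uniform random guess."""
    return 1.0 / num_classes, math.log(num_classes)

# Assumption: 8 classes, inferred from the first logged loss of about 2.08.
acc, loss = chance_baseline(8)
print(f"chance accuracy = {acc:.4f}, chance loss = {loss:.4f}")
```

A model that has learned nothing should sit near both numbers, which is what the first few epochs of the training log show.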

It achieves the following results on the evaluation set:

Loss: 1.0683
Accuracy: 0.675
Precision: 0.6854
F1: 0.6751

Model description

The base model is the base-size Vision Transformer (ViT) pre-trained on ImageNet-21k and released by Google. Further details can be found in their repository.

Training and evaluation data

Data Split

The data was split 4:1 into training and development sets using a random seed of 42. A seed of 42 was also used for batching the data; the two seeds are unrelated.
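A minimal sketch of a seeded 4:1 split, using only the standard library rather than the tooling actually used for this model. The total of 800 images is an assumption inferred from the training log (10 steps per epoch at a batch size of 64, and accuracies that are multiples of 1/160); the shuffle order will not match the original split.

```python
import math
import random

NUM_IMAGES = 800   # assumption: inferred from the logs, not stated in this card
SEED = 42
BATCH_SIZE = 64    # train_batch_size from the hyperparameters

# Shuffle example indices reproducibly, then cut at the 4:1 boundary.
indices = list(range(NUM_IMAGES))
random.Random(SEED).shuffle(indices)

n_train = NUM_IMAGES * 4 // 5
train_idx, dev_idx = indices[:n_train], indices[n_train:]

steps_per_epoch = math.ceil(len(train_idx) / BATCH_SIZE)
print(len(train_idx), len(dev_idx), steps_per_epoch)
```

With the Datasets library, a comparable split would typically be `dataset.train_test_split(test_size=0.2, seed=42)`, though whether that exact call was used here is not stated.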

Pre-processing and Augmentation

The main pre-processing phase for both training and evaluation includes:

Resizing each image to (224, 224, 3) with bilinear interpolation, the input size at which the original model was pre-trained on ImageNet images
Normalizing each image channel with a mean and standard deviation of [0.5, 0.5, 0.5], matching the original model
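To make the normalization step concrete, here is a small sketch of what a mean and standard deviation of 0.5 do to a single pixel. It assumes pixel values are first rescaled from 8-bit integers to [0, 1], which is the usual behaviour of ViT image processors but is not stated in this card; `normalize_pixel` is a hypothetical helper.

```python
MEAN = [0.5, 0.5, 0.5]
STD = [0.5, 0.5, 0.5]

def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then apply (x - mean) / std per channel."""
    return [((v / 255.0) - m) / s for v, m, s in zip(rgb, MEAN, STD)]

# Black maps to the lower end of the range, white to the upper end.
print(normalize_pixel([0, 0, 0]))
print(normalize_pixel([255, 255, 255]))
```

The effect is to map the input range [0, 1] onto [-1, 1], which is the range the pre-trained weights expect.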

Other than the aforementioned pre-processing, the training set was augmented using:

Random horizontal & vertical flip
Color jitter
Random resized crop
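The flip augmentation can be illustrated on a tiny image stored as a list of rows. This is a pure-Python sketch, not the code used for training; the flip probability of 0.5 is an assumption, and `random_flip` is a hypothetical helper. In practice these augmentations are commonly built from torchvision's `RandomHorizontalFlip`, `RandomVerticalFlip`, `ColorJitter`, and `RandomResizedCrop` transforms.

```python
import random

def random_flip(image, rng, p=0.5):
    """Flip a row-major image horizontally and/or vertically, each with probability p."""
    if rng.random() < p:
        image = [row[::-1] for row in image]  # horizontal flip: reverse each row
    if rng.random() < p:
        image = image[::-1]                   # vertical flip: reverse the row order
    return image

rng = random.Random(42)
tiny = [[1, 2], [3, 4]]
print(random_flip(tiny, rng))
```

Augmentation is applied to the training set only, so the development set is always evaluated on unmodified (resized and normalized) images.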

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine_with_restarts
lr_scheduler_warmup_steps: 150
num_epochs: 300
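The learning-rate schedule implied by these settings can be sketched as linear warmup followed by cosine decay with hard restarts. The formula below follows my understanding of how the Transformers library implements `cosine_with_restarts`; the total step count (300 epochs at 10 steps per epoch) and the single cycle are assumptions, since neither is stated in this card.

```python
import math

BASE_LR = 5e-5
WARMUP_STEPS = 150
TOTAL_STEPS = 300 * 10  # assumption: 300 epochs x 10 optimizer steps per epoch
NUM_CYCLES = 1          # assumption: the number of cosine cycles is not stated

def lr_at(step):
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # Cosine decay; the modulo restarts the curve at the start of each cycle.
    cycle_pos = (NUM_CYCLES * progress) % 1.0
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_pos)))

for step in (0, 75, 150, 1575, 3000):
    print(step, lr_at(step))
```

With a single cycle the schedule reduces to one warmup phase followed by a plain cosine decay to zero; restarts only appear when the cycle count is greater than one.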

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	F1
2.0804	1.0	10	2.0881	0.1437	0.2313	0.1165
2.0839	2.0	20	2.0846	0.1562	0.1772	0.1250
2.072	3.0	30	2.0786	0.1562	0.1835	0.1251
2.0676	4.0	40	2.0702	0.1562	0.2213	0.1265
2.053	5.0	50	2.0586	0.1625	0.2289	0.1330
2.0346	6.0	60	2.0390	0.1938	0.3508	0.1830
2.0072	7.0	70	2.0080	0.2437	0.3131	0.2285
1.9672	8.0	80	1.9506	0.325	0.3516	0.3209
1.8907	9.0	90	1.8587	0.3438	0.4010	0.3361
1.7841	10.0	100	1.7300	0.3937	0.4617	0.3860
1.6688	11.0	110	1.6084	0.4625	0.4958	0.4402
1.5803	12.0	120	1.5305	0.4875	0.5327	0.4661
1.5069	13.0	130	1.4577	0.5437	0.5171	0.5126
1.4353	14.0	140	1.3955	0.55	0.6004	0.5380
1.3913	15.0	150	1.3353	0.5437	0.6508	0.4995
1.3551	16.0	160	1.2874	0.5563	0.5251	0.5201
1.2889	17.0	170	1.2618	0.5687	0.5829	0.5475
1.2387	18.0	180	1.2455	0.5687	0.5723	0.5587
1.1977	19.0	190	1.2210	0.5875	0.6221	0.5858
1.1447	20.0	200	1.1909	0.6	0.6153	0.5840
1.0959	21.0	210	1.1918	0.5813	0.5896	0.5609
1.0657	22.0	220	1.1343	0.625	0.6352	0.6184
0.9869	23.0	230	1.1309	0.625	0.6549	0.6258
0.9576	24.0	240	1.1071	0.6312	0.6373	0.6280
0.9234	25.0	250	1.1407	0.6312	0.6469	0.6279
0.876	26.0	260	1.2006	0.5625	0.6040	0.5514
0.8969	27.0	270	1.1007	0.6125	0.6290	0.6121
0.8066	28.0	280	1.1208	0.6	0.6650	0.5971
0.7579	29.0	290	1.1328	0.6125	0.6625	0.6035
0.7581	30.0	300	1.1039	0.6125	0.6401	0.6121
0.7164	31.0	310	1.0862	0.65	0.6723	0.6494
0.7075	32.0	320	1.0575	0.65	0.6683	0.6485
0.6655	33.0	330	1.1186	0.6125	0.6483	0.6134
0.5947	34.0	340	1.1133	0.625	0.6439	0.6272
0.5813	35.0	350	1.1071	0.6312	0.6735	0.6337
0.6322	36.0	360	1.0839	0.6312	0.6591	0.6324
0.561	37.0	370	1.1040	0.625	0.6425	0.6220
0.558	38.0	380	1.0727	0.6125	0.6255	0.6112
0.5372	39.0	390	1.1417	0.6312	0.6545	0.6292
0.5146	40.0	400	1.0967	0.6312	0.6645	0.6285
0.4968	41.0	410	1.1187	0.6312	0.6543	0.6316
0.4593	42.0	420	1.0683	0.675	0.6854	0.6751
0.4392	43.0	430	1.0937	0.6375	0.6481	0.6374
0.4503	44.0	440	1.1320	0.625	0.6536	0.6255
0.3918	45.0	450	1.1218	0.6312	0.6464	0.6312
0.4236	46.0	460	1.2074	0.5938	0.6188	0.5911
0.3858	47.0	470	1.1769	0.5813	0.6106	0.5809
0.392	48.0	480	1.1572	0.625	0.6381	0.6216
0.3708	49.0	490	1.2293	0.6	0.6388	0.5953
0.3346	50.0	500	1.2205	0.5938	0.6188	0.5943
0.3831	51.0	510	1.2875	0.5875	0.5982	0.5845
0.4161	52.0	520	1.2355	0.5938	0.6421	0.5799
0.3736	53.0	530	1.2361	0.6062	0.6301	0.6006
0.3278	54.0	540	1.1670	0.6312	0.6520	0.6286
0.3295	55.0	550	1.1807	0.6438	0.6712	0.6457
0.3357	56.0	560	1.2007	0.625	0.6279	0.6239
0.3169	57.0	570	1.2314	0.5938	0.6257	0.5942
0.3193	58.0	580	1.2068	0.6188	0.6397	0.6208
0.3128	59.0	590	1.2753	0.5875	0.5919	0.5760
0.3077	60.0	600	1.2154	0.625	0.6432	0.6238
0.2751	61.0	610	1.2596	0.6125	0.6216	0.6099
0.2921	62.0	620	1.2716	0.6188	0.6467	0.6189
0.2939	63.0	630	1.2213	0.625	0.6350	0.6264
0.2732	64.0	640	1.3456	0.5938	0.6189	0.5897
0.2806	65.0	650	1.2491	0.6188	0.6393	0.6162
0.2453	66.0	660	1.2312	0.6188	0.6465	0.6195
0.3077	67.0	670	1.2356	0.6375	0.6564	0.6373
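The reported evaluation results match the epoch 42 row of the table. The sketch below, using a handful of rows copied from the table, shows that this row has the highest accuracy while the lowest validation loss occurs earlier. That the checkpoint was selected by accuracy is an inference from the numbers, not something the card states.

```python
# (epoch, validation loss, accuracy) copied from a few rows of the table above
rows = [
    (24, 1.1071, 0.6312),
    (32, 1.0575, 0.6500),
    (42, 1.0683, 0.6750),
    (55, 1.1807, 0.6438),
    (67, 1.2356, 0.6375),
]

best_by_acc = max(rows, key=lambda r: r[2])   # highest accuracy
best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
print("best accuracy at epoch", best_by_acc[0])
print("lowest validation loss at epoch", best_by_loss[0])
```

The widening gap between training loss and validation loss after this point suggests overfitting, which is consistent with the log ending well before the configured 300 epochs.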

Framework versions

Transformers 4.33.0
PyTorch 2.0.0
Datasets 2.1.0
Tokenizers 0.13.3