bge-m3-preference-classifier

This model is a fine-tuned version of BAAI/bge-m3. The training dataset is not specified. At the final logged checkpoint (step 12400), it achieves the following results on the evaluation set:

Loss: 0.3983
Accuracy: 0.775
Precision: 0.8090
Recall: 0.72
F1: 0.7619
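
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the two figures above. The helper name `f1_from_precision_recall` is illustrative and is not part of this model's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the evaluation set.
eval_f1 = f1_from_precision_recall(0.8090, 0.72)
print(round(eval_f1, 4))  # 0.7619
```

The recomputed value matches the reported F1 to four decimal places.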

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 12500
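
The values above allow a rough estimate of the training-set size and of the learning rate at any step. The sketch below makes three assumptions that this card does not confirm: a single device, no gradient accumulation, and no warmup (none is listed). The step and epoch pair is taken from the last row of the training results table in this card, and all function names are illustrative.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 3e-5
TRAIN_BATCH_SIZE = 16
TRAINING_STEPS = 12500

# Last logged row of the training results table: step 12400 at epoch 0.9938.
LAST_LOGGED_STEP = 12400
LAST_LOGGED_EPOCH = 0.9938


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


def approx_train_examples(step: int, epoch: float, batch_size: int) -> float:
    """Estimate the training-set size.

    Assumes one device and no gradient accumulation, so that each
    optimizer step consumes exactly `batch_size` examples.
    """
    return steps_per_epoch(step, epoch) * batch_size


def linear_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero at `total_steps`."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(round(steps_per_epoch(LAST_LOGGED_STEP, LAST_LOGGED_EPOCH)))
print(round(approx_train_examples(LAST_LOGGED_STEP, LAST_LOGGED_EPOCH, TRAIN_BATCH_SIZE)))
print(linear_lr(6250, LEARNING_RATE, TRAINING_STEPS))
```

Under these assumptions the run covers roughly one epoch of a little under 200,000 examples. The logged epoch values are rounded to four decimals, so the estimate is approximate.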

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	Recall	F1
0.5005	0.0160	200	0.5641	0.687	0.6423	0.844	0.7295
0.4211	0.0321	400	0.5017	0.721	0.7214	0.72	0.7207
0.4291	0.0481	600	0.4614	0.743	0.7042	0.838	0.7653
0.3761	0.0641	800	0.5737	0.734	0.8703	0.55	0.6740
0.3422	0.0801	1000	0.5178	0.737	0.7627	0.688	0.7234
0.3503	0.0962	1200	0.4898	0.744	0.8005	0.65	0.7174
0.3623	0.1122	1400	0.4514	0.738	0.7668	0.684	0.7230
0.3249	0.1282	1600	0.5200	0.732	0.7239	0.75	0.7367
0.3559	0.1443	1800	0.4797	0.742	0.7138	0.808	0.7580
0.3371	0.1603	2000	0.4597	0.761	0.8028	0.692	0.7433
0.329	0.1763	2200	0.4632	0.74	0.8409	0.592	0.6948
0.3257	0.1924	2400	0.4109	0.77	0.7411	0.83	0.7830
0.3606	0.2084	2600	0.4325	0.745	0.7740	0.692	0.7307
0.3369	0.2244	2800	0.4505	0.761	0.7171	0.862	0.7829
0.3399	0.2404	3000	0.4913	0.747	0.7163	0.818	0.7638
0.3421	0.2565	3200	0.4165	0.751	0.7363	0.782	0.7585
0.321	0.2725	3400	0.4502	0.753	0.7391	0.782	0.7600
0.3229	0.2885	3600	0.4042	0.766	0.7367	0.828	0.7797
0.3384	0.3046	3800	0.4186	0.754	0.8757	0.592	0.7064
0.3026	0.3206	4000	0.4198	0.761	0.7843	0.72	0.7508
0.3323	0.3366	4200	0.3999	0.769	0.8064	0.708	0.7540
0.3185	0.3526	4400	0.4058	0.778	0.7643	0.804	0.7836
0.3563	0.3687	4600	0.4410	0.744	0.8333	0.61	0.7044
0.3329	0.3847	4800	0.4047	0.761	0.7906	0.71	0.7482
0.3027	0.4007	5000	0.4064	0.764	0.8284	0.666	0.7384
0.3042	0.4168	5200	0.4035	0.765	0.7227	0.86	0.7854
0.3085	0.4328	5400	0.4235	0.766	0.8260	0.674	0.7423
0.3309	0.4488	5600	0.4143	0.761	0.8129	0.678	0.7394
0.3201	0.4649	5800	0.4089	0.766	0.8065	0.7	0.7495
0.3145	0.4809	6000	0.4110	0.775	0.8590	0.658	0.7452
0.3109	0.4969	6200	0.4067	0.766	0.7982	0.712	0.7526
0.3269	0.5129	6400	0.4038	0.771	0.8173	0.698	0.7530
0.3271	0.5290	6600	0.4098	0.773	0.8740	0.638	0.7376
0.3387	0.5450	6800	0.4164	0.763	0.7476	0.794	0.7701
0.2832	0.5610	7000	0.4000	0.774	0.7751	0.772	0.7735
0.288	0.5771	7200	0.4003	0.776	0.7974	0.74	0.7676
0.3171	0.5931	7400	0.4015	0.771	0.8249	0.688	0.7503
0.3007	0.6091	7600	0.4099	0.776	0.8366	0.686	0.7538
0.2909	0.6252	7800	0.4090	0.775	0.8571	0.66	0.7458
0.3153	0.6412	8000	0.4135	0.774	0.7585	0.804	0.7806
0.3109	0.6572	8200	0.4157	0.773	0.7650	0.788	0.7764
0.3025	0.6732	8400	0.4030	0.766	0.7930	0.72	0.7547
0.3214	0.6893	8600	0.3960	0.776	0.7924	0.748	0.7695
0.2937	0.7053	8800	0.4014	0.77	0.7922	0.732	0.7609
0.3112	0.7213	9000	0.3981	0.777	0.8085	0.726	0.7650
0.2957	0.7374	9200	0.4131	0.769	0.7905	0.732	0.7601
0.302	0.7534	9400	0.3990	0.773	0.8027	0.724	0.7613
0.3011	0.7694	9600	0.4009	0.777	0.8244	0.704	0.7594
0.32	0.7854	9800	0.4023	0.774	0.8157	0.708	0.7580
0.2845	0.8015	10000	0.4068	0.771	0.8115	0.706	0.7551
0.3174	0.8175	10200	0.4033	0.772	0.8049	0.718	0.7590
0.3287	0.8335	10400	0.3983	0.774	0.8072	0.72	0.7611
0.3018	0.8496	10600	0.3983	0.773	0.8067	0.718	0.7598
0.2962	0.8656	10800	0.3974	0.77	0.8111	0.704	0.7537
0.3279	0.8816	11000	0.3965	0.772	0.8119	0.708	0.7564
0.2978	0.8977	11200	0.3967	0.774	0.8100	0.716	0.7601
0.3142	0.9137	11400	0.3972	0.771	0.8031	0.718	0.7582
0.3202	0.9297	11600	0.3977	0.773	0.8040	0.722	0.7608
0.297	0.9457	11800	0.3984	0.774	0.8072	0.72	0.7611
0.3244	0.9618	12000	0.3982	0.773	0.8040	0.722	0.7608
0.3078	0.9778	12200	0.3986	0.772	0.8049	0.718	0.7590
0.3244	0.9938	12400	0.3983	0.775	0.8090	0.72	0.7619
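
The validation metrics in the table peak at different steps, so the "best" checkpoint depends on the selection criterion. The sketch below illustrates this with a handful of rows copied from the table; it includes the rows holding the lowest validation loss, highest accuracy, and highest F1 in the full log. The `EvalRow` type is illustrative, and the card does not say which criterion, if any, was used to pick the published weights.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    accuracy: float
    precision: float
    recall: float
    f1: float


# A subset of rows copied from the table above (not the full log).
ROWS = [
    EvalRow(4400, 0.4058, 0.778, 0.7643, 0.804, 0.7836),
    EvalRow(5200, 0.4035, 0.765, 0.7227, 0.86, 0.7854),
    EvalRow(8600, 0.3960, 0.776, 0.7924, 0.748, 0.7695),
    EvalRow(9000, 0.3981, 0.777, 0.8085, 0.726, 0.7650),
    EvalRow(12400, 0.3983, 0.775, 0.8090, 0.72, 0.7619),
]

# Each criterion selects a different step.
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
best_by_accuracy = max(ROWS, key=lambda row: row.accuracy)
best_by_f1 = max(ROWS, key=lambda row: row.f1)

print("lowest validation loss:", best_by_loss.step)
print("highest accuracy:", best_by_accuracy.step)
print("highest F1:", best_by_f1.step)
```

The final logged checkpoint (step 12400) is close to the best on validation loss and accuracy, while the highest F1 values occur earlier in training at steps with higher recall.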

Evaluation on test split

Accuracy	Precision	Recall	F1
0.7530	0.8524	0.6120	0.7125
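
The card does not state the size or class balance of either split. The granularity of the reported figures (accuracy in steps of 0.001, recall in steps of 0.002) is consistent with 1,000 examples of which 500 are positive, but this is an inference and not a documented fact. Under that assumption, the sketch below reconstructs confusion-matrix counts from accuracy and recall and checks them against the reported precision. The function name is illustrative.

```python
def confusion_from_metrics(accuracy: float, recall: float, n_total: int, n_positive: int):
    """Reconstruct (tp, fp, fn, tn) from accuracy and recall for a binary task.

    `n_total` and `n_positive` are assumptions supplied by the caller;
    the model card does not document them.
    """
    tp = round(recall * n_positive)        # positives correctly predicted
    fn = n_positive - tp                   # positives missed
    correct = round(accuracy * n_total)    # tp + tn
    tn = correct - tp
    fp = (n_total - n_positive) - tn       # negatives predicted as positive
    return tp, fp, fn, tn


# Assumed split shape: 1,000 examples, 500 positive (not stated in the card).
N_TOTAL, N_POSITIVE = 1000, 500

test_tp, test_fp, test_fn, test_tn = confusion_from_metrics(0.7530, 0.6120, N_TOTAL, N_POSITIVE)
test_precision = test_tp / (test_tp + test_fp)

eval_tp, eval_fp, eval_fn, eval_tn = confusion_from_metrics(0.775, 0.72, N_TOTAL, N_POSITIVE)
eval_precision = eval_tp / (eval_tp + eval_fp)

print((test_tp, test_fp, test_fn, test_tn), round(test_precision, 4))
print((eval_tp, eval_fp, eval_fn, eval_tn), round(eval_precision, 4))
```

Both reconstructed precisions round to the reported values (0.8524 for the test split, 0.8090 for the evaluation set), which is consistent with the assumed split shape but does not prove it. Compared with the evaluation set, the test split shows lower recall (0.6120 against 0.72) and higher precision.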

Framework versions

Transformers 4.43.1
PyTorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Model repository: ryota39/bge-m3-preference-classifier