KMB_SimCSE_test

This model is a fine-tuned version of x2bee/KoModernBERT-base-mlm-v03-retry-ckp03 on an unknown dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the metric list):

  • Loss: 0.0306
  • Pearson Cosine: 0.8211
  • Spearman Cosine: 0.8198
  • Pearson Manhattan: 0.7909
  • Spearman Manhattan: 0.7991
  • Pearson Euclidean: 0.7883
  • Spearman Euclidean: 0.7968
  • Pearson Dot: 0.7578
  • Spearman Dot: 0.7578
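
The cosine-based metrics above score sentence-pair similarity, so the model is used as a sentence embedding encoder. Below is a minimal usage sketch, assuming the checkpoint is published as CocoRoF/KMB_SimCSE_test and was saved in sentence-transformers format; the example sentences are placeholders.

```python
# Minimal usage sketch. Assumption: the checkpoint "CocoRoF/KMB_SimCSE_test"
# is loadable with the sentence-transformers library.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("CocoRoF/KMB_SimCSE_test")

# Placeholder Korean sentence pair.
sentences = [
    "오늘 날씨가 정말 좋다.",  # "The weather is really nice today."
    "하늘이 맑고 화창하다.",  # "The sky is clear and sunny."
]

# Encode to dense embeddings, then score with cosine similarity,
# the same similarity used by the Pearson/Spearman Cosine metrics above.
embeddings = model.encode(sentences)
print(util.cos_sim(embeddings[0], embeddings[1]))
```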

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 4.0
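
The same configuration can be written as Hugging Face TrainingArguments. This is only a hedged reconstruction from the list above; the actual training script and output directory are not published.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
# The output_dir is an assumption; the real training script is not published.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="KMB_SimCSE_test",    # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=8,   # total train batch size of 128, per the card
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=4.0,
)
```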

Training results

| Training Loss | Epoch | Step | Validation Loss | Pearson Cosine | Spearman Cosine | Pearson Manhattan | Spearman Manhattan | Pearson Euclidean | Spearman Euclidean | Pearson Dot | Spearman Dot |
|---------------|-------|------|-----------------|----------------|-----------------|-------------------|--------------------|-------------------|--------------------|-------------|--------------|
| 0.4859 | 0.1172 | 250 | 0.0753 | 0.7923 | 0.7923 | 0.7833 | 0.7911 | 0.7825 | 0.7907 | 0.6785 | 0.6757 |
| 0.4421 | 0.2343 | 500 | 0.0699 | 0.7956 | 0.7989 | 0.7894 | 0.7987 | 0.7887 | 0.7980 | 0.6754 | 0.6702 |
| 0.3553 | 0.3515 | 750 | 0.0556 | 0.8076 | 0.8088 | 0.8036 | 0.8096 | 0.8024 | 0.8090 | 0.7051 | 0.7031 |
| 0.3311 | 0.4686 | 1000 | 0.0558 | 0.8114 | 0.8143 | 0.8050 | 0.8126 | 0.8040 | 0.8118 | 0.7185 | 0.7185 |
| 0.3541 | 0.5858 | 1250 | 0.0556 | 0.8070 | 0.8099 | 0.8135 | 0.8183 | 0.8126 | 0.8180 | 0.7040 | 0.7018 |
| 0.344 | 0.7029 | 1500 | 0.0549 | 0.8153 | 0.8197 | 0.8109 | 0.8202 | 0.8097 | 0.8188 | 0.7054 | 0.7078 |
| 0.3268 | 0.8201 | 1750 | 0.0535 | 0.8172 | 0.8210 | 0.8138 | 0.8211 | 0.8128 | 0.8202 | 0.7224 | 0.7208 |
| 0.3399 | 0.9372 | 2000 | 0.0569 | 0.8113 | 0.8163 | 0.8073 | 0.8162 | 0.8066 | 0.8152 | 0.7242 | 0.7226 |
| 0.2473 | 1.0544 | 2250 | 0.0453 | 0.8124 | 0.8143 | 0.8031 | 0.8103 | 0.8020 | 0.8093 | 0.7271 | 0.7261 |
| 0.2563 | 1.1715 | 2500 | 0.0408 | 0.8178 | 0.8195 | 0.8043 | 0.8132 | 0.8032 | 0.8120 | 0.7518 | 0.7504 |
| 0.2841 | 1.2887 | 2750 | 0.0437 | 0.8074 | 0.8100 | 0.8063 | 0.8138 | 0.8053 | 0.8130 | 0.7237 | 0.7204 |
| 0.2462 | 1.4058 | 3000 | 0.0419 | 0.8164 | 0.8192 | 0.8050 | 0.8143 | 0.8039 | 0.8132 | 0.7395 | 0.7393 |
| 0.2328 | 1.5230 | 3250 | 0.0404 | 0.8187 | 0.8203 | 0.8084 | 0.8165 | 0.8070 | 0.8154 | 0.7426 | 0.7414 |
| 0.2052 | 1.6401 | 3500 | 0.0390 | 0.8147 | 0.8164 | 0.8045 | 0.8129 | 0.8035 | 0.8122 | 0.7426 | 0.7422 |
| 0.262 | 1.7573 | 3750 | 0.0419 | 0.8188 | 0.8204 | 0.8080 | 0.8170 | 0.8067 | 0.8158 | 0.7306 | 0.7294 |
| 0.2269 | 1.8744 | 4000 | 0.0393 | 0.8218 | 0.8235 | 0.8002 | 0.8112 | 0.7985 | 0.8094 | 0.7384 | 0.7375 |
| 0.2472 | 1.9916 | 4250 | 0.0400 | 0.8203 | 0.8224 | 0.8053 | 0.8160 | 0.8040 | 0.8147 | 0.7317 | 0.7308 |
| 0.1838 | 2.1087 | 4500 | 0.0348 | 0.8184 | 0.8191 | 0.8023 | 0.8099 | 0.8005 | 0.8085 | 0.7495 | 0.7481 |
| 0.1509 | 2.2259 | 4750 | 0.0359 | 0.8117 | 0.8120 | 0.7977 | 0.8054 | 0.7958 | 0.8036 | 0.7344 | 0.7343 |
| 0.1816 | 2.3430 | 5000 | 0.0330 | 0.8185 | 0.8181 | 0.8000 | 0.8079 | 0.7978 | 0.8060 | 0.7507 | 0.7501 |
| 0.166 | 2.4602 | 5250 | 0.0335 | 0.8183 | 0.8188 | 0.8015 | 0.8107 | 0.7997 | 0.8091 | 0.7450 | 0.7445 |
| 0.1572 | 2.5773 | 5500 | 0.0352 | 0.8123 | 0.8135 | 0.8021 | 0.8100 | 0.8003 | 0.8084 | 0.7368 | 0.7336 |
| 0.1353 | 2.6945 | 5750 | 0.0333 | 0.8210 | 0.8211 | 0.8045 | 0.8123 | 0.8024 | 0.8103 | 0.7463 | 0.7463 |
| 0.1555 | 2.8116 | 6000 | 0.0325 | 0.8185 | 0.8183 | 0.7959 | 0.8036 | 0.7939 | 0.8019 | 0.7526 | 0.7538 |
| 0.152 | 2.9288 | 6250 | 0.0326 | 0.8154 | 0.8151 | 0.7929 | 0.8018 | 0.7908 | 0.8001 | 0.7415 | 0.7427 |
| 0.1 | 3.0459 | 6500 | 0.0312 | 0.8194 | 0.8190 | 0.7908 | 0.7990 | 0.7886 | 0.7972 | 0.7565 | 0.7571 |
| 0.1075 | 3.1631 | 6750 | 0.0318 | 0.8184 | 0.8181 | 0.7949 | 0.8031 | 0.7928 | 0.8016 | 0.7567 | 0.7583 |
| 0.0971 | 3.2802 | 7000 | 0.0312 | 0.8183 | 0.8176 | 0.7905 | 0.7992 | 0.7882 | 0.7970 | 0.7561 | 0.7572 |
| 0.12 | 3.3974 | 7250 | 0.0303 | 0.8237 | 0.8230 | 0.7953 | 0.8035 | 0.7930 | 0.8016 | 0.7683 | 0.7690 |
| 0.1003 | 3.5145 | 7500 | 0.0315 | 0.8181 | 0.8172 | 0.7964 | 0.8047 | 0.7941 | 0.8028 | 0.7502 | 0.7505 |
| 0.1237 | 3.6317 | 7750 | 0.0308 | 0.8190 | 0.8178 | 0.7915 | 0.7990 | 0.7886 | 0.7969 | 0.7589 | 0.7583 |
| 0.0991 | 3.7488 | 8000 | 0.0315 | 0.8186 | 0.8172 | 0.7952 | 0.8024 | 0.7925 | 0.8000 | 0.7540 | 0.7531 |
| 0.1017 | 3.8660 | 8250 | 0.0311 | 0.8182 | 0.8174 | 0.7925 | 0.8007 | 0.7900 | 0.7986 | 0.7532 | 0.7523 |
| 0.1132 | 3.9831 | 8500 | 0.0306 | 0.8211 | 0.8198 | 0.7909 | 0.7991 | 0.7883 | 0.7968 | 0.7578 | 0.7578 |
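
The per-checkpoint metric names above (Pearson/Spearman over cosine, Manhattan, Euclidean, and dot-product similarity) match what sentence-transformers' EmbeddingSimilarityEvaluator reports on a semantic textual similarity set. A hedged sketch of such an evaluation follows; the actual evaluation pairs are not published, so the data below is placeholder.

```python
# Hedged sketch of an STS-style evaluation that yields the metric set above.
# The sentence pairs and gold scores are placeholders, not the real eval data.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("CocoRoF/KMB_SimCSE_test")  # assumed model ID

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["눈이 온다.", "강아지가 뛴다.", "밥을 먹었다."],
    sentences2=["눈이 내린다.", "고양이가 잔다.", "식사를 마쳤다."],
    scores=[0.9, 0.2, 0.8],  # placeholder gold similarities in [0, 1]
)

# Recent sentence-transformers versions return a dict with keys such as
# pearson_cosine, spearman_cosine, pearson_euclidean, spearman_dot, ...
print(evaluator(model))
```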

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.0