medsiglip-448-ft-vindr-10ep

This model is a fine-tuned version of google/medsiglip-448; the training dataset is not recorded in this card. At the last logged evaluation (step 2000) it reaches a validation loss of 4.6067 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 8
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 64
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.03
lr_scheduler_warmup_steps: 300
num_epochs: 10
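
The batch-size and scheduler entries above can be related with a small sketch. It is an illustration in plain Python, not the training code: it assumes a single device (consistent with 8 x 8 = 64, but not stated in the card) and a total of about 2025 optimizer steps, extrapolated from step 100 falling at epoch 0.4938 in the training log. It also uses the 300-step warmup rather than the 0.03 ratio, on the assumption that a positive warmup_steps takes precedence when both are set. The names `effective_batch_size` and `cosine_lr` are hypothetical helpers.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-5
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 300

# Assumptions (not stated in the card): one device, and roughly 2025
# optimizer steps in total, extrapolated from the logged epoch fractions.
NUM_DEVICES = 1
TOTAL_STEPS = 2025


def effective_batch_size(per_device, accum_steps, num_devices):
    """Number of samples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def cosine_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to base_lr, then half-cosine decay towards zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 150, 300, 1000, 2025):
        print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 3e-05 at step 300 and decays for the remaining steps; changing `TOTAL_STEPS` shifts the decay curve but not the warmup.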

Training Loss	Epoch	Step	Validation Loss
5.0847	0.4938	100	4.0767
2.9526	0.9877	200	3.9932
2.9083	1.4790	300	4.1026
2.8957	1.9728	400	4.1140
2.8337	2.4642	500	4.0660
2.8343	2.9580	600	4.1254
2.7338	3.4494	700	4.0615
2.7297	3.9432	800	4.0960
2.6297	4.4346	900	4.0806
2.6329	4.9284	1000	4.0399
2.5551	5.4198	1100	4.1729
2.5151	5.9136	1200	4.1392
2.4801	6.4049	1300	4.2986
2.4498	6.8988	1400	4.4359
2.4056	7.3901	1500	4.4775
2.4119	7.8840	1600	4.4226
2.3567	8.3753	1700	4.6271
2.3579	8.8691	1800	4.5365
2.3653	9.3605	1900	4.5948
2.3486	9.8543	2000	4.6067
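
In the table above, training loss falls steadily while validation loss is lowest early on and rises in later epochs. A minimal sketch for inspecting this, with the rows copied by hand from the table, is shown here. The helper names `best_checkpoint` and `generalization_gap` are hypothetical and are not part of any library.

```python
# (step, epoch, training_loss, validation_loss) rows copied from the table above.
LOG = [
    (100, 0.4938, 5.0847, 4.0767),
    (200, 0.9877, 2.9526, 3.9932),
    (300, 1.4790, 2.9083, 4.1026),
    (400, 1.9728, 2.8957, 4.1140),
    (500, 2.4642, 2.8337, 4.0660),
    (600, 2.9580, 2.8343, 4.1254),
    (700, 3.4494, 2.7338, 4.0615),
    (800, 3.9432, 2.7297, 4.0960),
    (900, 4.4346, 2.6297, 4.0806),
    (1000, 4.9284, 2.6329, 4.0399),
    (1100, 5.4198, 2.5551, 4.1729),
    (1200, 5.9136, 2.5151, 4.1392),
    (1300, 6.4049, 2.4801, 4.2986),
    (1400, 6.8988, 2.4498, 4.4359),
    (1500, 7.3901, 2.4056, 4.4775),
    (1600, 7.8840, 2.4119, 4.4226),
    (1700, 8.3753, 2.3567, 4.6271),
    (1800, 8.8691, 2.3579, 4.5365),
    (1900, 9.3605, 2.3653, 4.5948),
    (2000, 9.8543, 2.3486, 4.6067),
]


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def generalization_gap(row):
    """Validation loss minus training loss for one logged row."""
    return row[3] - row[2]


if __name__ == "__main__":
    step, epoch, train_loss, val_loss = best_checkpoint(LOG)
    print(f"lowest validation loss {val_loss} at step {step} (epoch {epoch})")
    for row in LOG:
        print(row[0], f"gap={generalization_gap(row):.4f}")
```

Whether the step with the lowest validation loss is the best checkpoint for a given use depends on the downstream metric, which this card does not report.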

Model size: 0.9B params (Safetensors)
Tensor type: F32
