# medgemma-4b-it-sft-lora
This model is a LoRA (PEFT) fine-tune of unsloth/medgemma-4b-it-unsloth-bnb-4bit on a dataset loaded in the imagefolder format. It achieves the following results on the evaluation set:
- Loss: 0.0574
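
The card does not include a usage example. Below is a minimal inference sketch, assuming the adapter is loaded with PEFT on top of the 4-bit base model; the repository ids come from this card, while the image path, prompt, and generation settings are placeholders.

```python
# Minimal sketch: load the LoRA adapter on top of the 4-bit base model.
# Assumes bitsandbytes and accelerate are installed for the bnb-4bit base.
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
from peft import PeftModel
from PIL import Image

base_id = "unsloth/medgemma-4b-it-unsloth-bnb-4bit"
adapter_id = "BlacqTangent/medgemma-4b-it-sft-lora"

processor = AutoProcessor.from_pretrained(base_id)
base_model = AutoModelForImageTextToText.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Hypothetical input: one image plus a text instruction.
image = Image.open("example.png")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe the findings in this image."},
        ],
    }
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(processor.decode(output[0][prompt_len:], skip_special_tokens=True))
```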
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
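
The training data is not described beyond the imagefolder format mentioned above. As a rough illustration only, a dataset in that format is typically loaded with the datasets library as sketched below; the directory path and split layout are placeholders, not the actual training data.

```python
# Sketch: loading an imagefolder-format dataset with the datasets library.
# "path/to/images" is a placeholder, not the data used for this model.
from datasets import load_dataset

dataset = load_dataset("imagefolder", data_dir="path/to/images")
# A typical layout the loader understands:
#   path/to/images/train/<label>/*.jpg
#   path/to/images/validation/<label>/*.jpg
print(dataset)
```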
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
- mixed_precision_training: Native AMP
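
For reference, here is a minimal sketch of a Trainer configuration that mirrors the hyperparameters above. The trainer choice, argument names, and output directory are assumptions (the card does not state which training script was used); only the numeric values come from this card.

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="medgemma-4b-it-sft-lora",  # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,          # 2 x 4 = total train batch size of 8
    num_train_epochs=2,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    optim="adamw_torch_fused",              # AdamW, betas=(0.9, 0.999), eps=1e-08
    fp16=True,                              # "Native AMP"; bf16 may be preferable on recent GPUs
    seed=42,
)
```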
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.7993 | 0.0333 | 2 | 6.9157 |
| 1.6601 | 0.0667 | 4 | 5.8620 |
| 1.4103 | 0.1 | 6 | 5.0971 |
| 1.2285 | 0.1333 | 8 | 4.4551 |
| 1.0764 | 0.1667 | 10 | 3.8693 |
| 0.9156 | 0.2 | 12 | 3.2681 |
| 0.7674 | 0.2333 | 14 | 2.6737 |
| 0.6259 | 0.2667 | 16 | 2.2148 |
| 0.5128 | 0.3 | 18 | 1.7809 |
| 0.4205 | 0.3333 | 20 | 1.4785 |
| 0.3432 | 0.3667 | 22 | 1.2499 |
| 0.3018 | 0.4 | 24 | 1.0306 |
| 0.2419 | 0.4333 | 26 | 0.8281 |
| 0.1976 | 0.4667 | 28 | 0.6838 |
| 0.1637 | 0.5 | 30 | 0.6041 |
| 0.1464 | 0.5333 | 32 | 0.5712 |
| 0.1387 | 0.5667 | 34 | 0.5351 |
| 0.1297 | 0.6 | 36 | 0.4897 |
| 0.1176 | 0.6333 | 38 | 0.4398 |
| 0.1049 | 0.6667 | 40 | 0.3819 |
| 0.0901 | 0.7 | 42 | 0.3173 |
| 0.0766 | 0.7333 | 44 | 0.3057 |
| 0.0776 | 0.7667 | 46 | 0.3289 |
| 0.0845 | 0.8 | 48 | 0.3138 |
| 0.0757 | 0.8333 | 50 | 0.2743 |
| 0.0655 | 0.8667 | 52 | 0.2402 |
| 0.0595 | 0.9 | 54 | 0.2269 |
| 0.0558 | 0.9333 | 56 | 0.2150 |
| 0.0518 | 0.9667 | 58 | 0.1877 |
| 0.0445 | 1.0 | 60 | 0.1518 |
| 0.0364 | 1.0333 | 62 | 0.1202 |
| 0.0279 | 1.0667 | 64 | 0.0921 |
| 0.0212 | 1.1 | 66 | 0.0718 |
| 0.0211 | 1.1333 | 68 | 0.0730 |
| 0.0205 | 1.1667 | 70 | 0.0773 |
| 0.0195 | 1.2 | 72 | 0.0687 |
| 0.016 | 1.2333 | 74 | 0.0623 |
| 0.0153 | 1.2667 | 76 | 0.0655 |
| 0.0166 | 1.3 | 78 | 0.0680 |
| 0.016 | 1.3333 | 80 | 0.0654 |
| 0.0165 | 1.3667 | 82 | 0.0614 |
| 0.0143 | 1.4 | 84 | 0.0608 |
| 0.0151 | 1.4333 | 86 | 0.0612 |
| 0.0168 | 1.4667 | 88 | 0.0604 |
| 0.0153 | 1.5 | 90 | 0.0591 |
| 0.0147 | 1.5333 | 92 | 0.0585 |
| 0.0155 | 1.5667 | 94 | 0.0590 |
| 0.0148 | 1.6 | 96 | 0.0596 |
| 0.0149 | 1.6333 | 98 | 0.0603 |
| 0.0148 | 1.6667 | 100 | 0.0604 |
| 0.0149 | 1.7 | 102 | 0.0601 |
| 0.0148 | 1.7333 | 104 | 0.0594 |
| 0.0146 | 1.7667 | 106 | 0.0586 |
| 0.014 | 1.8 | 108 | 0.0579 |
| 0.0158 | 1.8333 | 110 | 0.0576 |
| 0.0148 | 1.8667 | 112 | 0.0574 |
| 0.0161 | 1.9 | 114 | 0.0574 |
| 0.015 | 1.9333 | 116 | 0.0574 |
| 0.0151 | 1.9667 | 118 | 0.0573 |
| 0.0154 | 2.0 | 120 | 0.0574 |
### Framework versions
- PEFT 0.16.0
- Transformers 4.57.1
- Pytorch 2.6.0+cu124
- Datasets 4.3.0
- Tokenizers 0.22.1
## Model tree for BlacqTangent/medgemma-4b-it-sft-lora
- Base model: google/gemma-3-4b-pt
- Fine-tuned: google/medgemma-4b-pt
- Fine-tuned: google/medgemma-4b-it
- Quantized: unsloth/medgemma-4b-it-unsloth-bnb-4bit (the base this adapter was trained on)