End of training

Files changed:
- README.md +162 -0
- adapter_model.safetensors +1 -1

README.md
ADDED
@@ -0,0 +1,162 @@
---
license: gemma
library_name: peft
tags:
- trl
- sft
- generated_from_trainer
base_model: google/gemma-2b-it
datasets:
- generator
model-index:
- name: kuntur-peru-legal-es-gemma-2b-it
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# kuntur-peru-legal-es-gemma-2b-it

This model is a fine-tuned version of [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) on the generator dataset.
It achieves the following results on the evaluation set:
- Loss: 1.1387

## Model description

More information needed

## Intended uses & limitations

More information needed
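
This repository contains only a PEFT adapter (`adapter_model.safetensors`, roughly 7 MB), so it must be loaded on top of the `google/gemma-2b-it` base model. Below is a minimal loading sketch; the adapter repo id shown is a hypothetical placeholder for this model's actual Hub path.

```python
# Minimal usage sketch. Assumption: the adapter is hosted under the
# hypothetical repo id "kuntur-peru-legal-es-gemma-2b-it"; substitute
# the real Hub path. Only adapter weights live in this repo, so the
# google/gemma-2b-it base model is loaded first and the adapter on top.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2b-it"
adapter_id = "kuntur-peru-legal-es-gemma-2b-it"  # hypothetical path

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)

inputs = tokenizer("Consulta legal de ejemplo:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For inference-only use, `model = model.merge_and_unload()` folds the adapter into the base weights and drops the PEFT wrappers.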

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2.5e-05
- train_batch_size: 4
- eval_batch_size: 1
- seed: 66
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
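
For anyone reproducing the run, the list above maps roughly onto `transformers.TrainingArguments` as sketched below. This is a hedged sketch: the card does not include the actual training script, and the `trl`/`sft` tags only suggest the run was driven through `trl`'s `SFTTrainer`.

```python
from transformers import TrainingArguments

# Rough mapping of the hyperparameters above onto TrainingArguments.
# The values come from this card; everything else (output_dir, the
# single-GPU assumption behind total_train_batch_size = 4 x 4 = 16)
# is assumed.
training_args = TrainingArguments(
    output_dir="kuntur-peru-legal-es-gemma-2b-it",  # assumed
    learning_rate=2.5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    seed=66,
    gradient_accumulation_steps=4,
    # optimizer: the listed Adam betas/epsilon are the AdamW defaults
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,  # "Native AMP"; fp16 (rather than bf16) is an assumption
)
```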

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.7041 | 0.51 | 50 | 3.6704 |
| 2.5585 | 1.02 | 100 | 2.5245 |
| 1.8723 | 1.53 | 150 | 1.9012 |
| 1.697 | 2.05 | 200 | 1.6294 |
| 1.5123 | 2.56 | 250 | 1.5092 |
| 1.3844 | 3.07 | 300 | 1.4406 |
| 1.4082 | 3.58 | 350 | 1.3942 |
| 1.3473 | 4.09 | 400 | 1.3614 |
| 1.2698 | 4.6 | 450 | 1.3338 |
| 1.3179 | 5.12 | 500 | 1.3127 |
| 1.2776 | 5.63 | 550 | 1.2942 |
| 1.2529 | 6.14 | 600 | 1.2781 |
| 1.2148 | 6.65 | 650 | 1.2667 |
| 1.2378 | 7.16 | 700 | 1.2538 |
| 1.1976 | 7.67 | 750 | 1.2418 |
| 1.2107 | 8.18 | 800 | 1.2325 |
| 1.199 | 8.7 | 850 | 1.2216 |
| 1.1498 | 9.21 | 900 | 1.2149 |
| 1.1788 | 9.72 | 950 | 1.2059 |
| 1.0873 | 10.23 | 1000 | 1.1995 |
| 1.1124 | 10.74 | 1050 | 1.1912 |
| 1.1161 | 11.25 | 1100 | 1.1858 |
| 1.1408 | 11.76 | 1150 | 1.1782 |
| 1.083 | 12.28 | 1200 | 1.1735 |
| 1.1234 | 12.79 | 1250 | 1.1659 |
| 1.1065 | 13.3 | 1300 | 1.1609 |
| 1.112 | 13.81 | 1350 | 1.1555 |
| 1.0759 | 14.32 | 1400 | 1.1513 |
| 1.0783 | 14.83 | 1450 | 1.1462 |
| 1.0466 | 15.35 | 1500 | 1.1455 |
| 1.0334 | 15.86 | 1550 | 1.1424 |
| 1.045 | 16.37 | 1600 | 1.1405 |
| 1.016 | 16.88 | 1650 | 1.1393 |
| 1.0449 | 17.39 | 1700 | 1.1371 |
| 1.0642 | 17.9 | 1750 | 1.1338 |
| 1.0276 | 18.41 | 1800 | 1.1340 |
| 1.0328 | 18.93 | 1850 | 1.1313 |
| 1.0232 | 19.44 | 1900 | 1.1326 |
| 1.0588 | 19.95 | 1950 | 1.1284 |
| 0.9971 | 20.46 | 2000 | 1.1298 |
| 1.0561 | 20.97 | 2050 | 1.1269 |
| 1.0714 | 21.48 | 2100 | 1.1279 |
| 1.0358 | 21.99 | 2150 | 1.1270 |
| 0.9744 | 22.51 | 2200 | 1.1274 |
| 1.0019 | 23.02 | 2250 | 1.1275 |
| 0.9362 | 23.53 | 2300 | 1.1258 |
| 1.0143 | 24.04 | 2350 | 1.1254 |
| 1.009 | 24.55 | 2400 | 1.1290 |
| 0.9969 | 25.06 | 2450 | 1.1253 |
| 0.8828 | 25.58 | 2500 | 1.1256 |
| 1.022 | 26.09 | 2550 | 1.1257 |
| 0.9804 | 26.6 | 2600 | 1.1265 |
| 0.9851 | 27.11 | 2650 | 1.1276 |
| 0.9617 | 27.62 | 2700 | 1.1265 |
| 0.9346 | 28.13 | 2750 | 1.1263 |
| 0.9552 | 28.64 | 2800 | 1.1258 |
| 0.9376 | 29.16 | 2850 | 1.1287 |
| 0.9359 | 29.67 | 2900 | 1.1262 |
| 0.9447 | 30.18 | 2950 | 1.1271 |
| 0.9646 | 30.69 | 3000 | 1.1278 |
| 0.926 | 31.2 | 3050 | 1.1293 |
| 0.9456 | 31.71 | 3100 | 1.1293 |
| 0.9223 | 32.23 | 3150 | 1.1296 |
| 0.9589 | 32.74 | 3200 | 1.1278 |
| 1.0145 | 33.25 | 3250 | 1.1299 |
| 0.9315 | 33.76 | 3300 | 1.1292 |
| 0.8946 | 34.27 | 3350 | 1.1311 |
| 0.9441 | 34.78 | 3400 | 1.1297 |
| 0.8996 | 35.29 | 3450 | 1.1317 |
| 0.9307 | 35.81 | 3500 | 1.1290 |
| 0.9005 | 36.32 | 3550 | 1.1329 |
| 0.9167 | 36.83 | 3600 | 1.1303 |
| 0.9393 | 37.34 | 3650 | 1.1322 |
| 0.9658 | 37.85 | 3700 | 1.1313 |
| 0.9375 | 38.36 | 3750 | 1.1341 |
| 0.9176 | 38.87 | 3800 | 1.1326 |
| 0.8982 | 39.39 | 3850 | 1.1351 |
| 0.9685 | 39.9 | 3900 | 1.1326 |
| 0.9216 | 40.41 | 3950 | 1.1355 |
| 0.9542 | 40.92 | 4000 | 1.1342 |
| 0.8739 | 41.43 | 4050 | 1.1371 |
| 0.9329 | 41.94 | 4100 | 1.1355 |
| 0.9335 | 42.46 | 4150 | 1.1354 |
| 0.8851 | 42.97 | 4200 | 1.1363 |
| 0.9217 | 43.48 | 4250 | 1.1377 |
| 0.8794 | 43.99 | 4300 | 1.1363 |
| 0.9104 | 44.5 | 4350 | 1.1371 |
| 0.8751 | 45.01 | 4400 | 1.1367 |
| 0.9157 | 45.52 | 4450 | 1.1377 |
| 0.8277 | 46.04 | 4500 | 1.1374 |
| 0.8858 | 46.55 | 4550 | 1.1384 |
| 0.9195 | 47.06 | 4600 | 1.1378 |
| 0.925 | 47.57 | 4650 | 1.1383 |
| 0.9007 | 48.08 | 4700 | 1.1384 |
| 0.9184 | 48.59 | 4750 | 1.1385 |
| 0.8798 | 49.1 | 4800 | 1.1385 |
| 0.8596 | 49.62 | 4850 | 1.1387 |

### Framework versions

- PEFT 0.10.0
- Transformers 4.38.0
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
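
Since adapter-loading behavior can shift between PEFT and Transformers releases, it may be worth checking the runtime environment against the versions above. The snippet below is an optional convenience, not part of the original training setup.

```python
# Optional sanity check: compare the installed stack against the
# versions this adapter was trained with (listed above).
import datasets, peft, tokenizers, torch, transformers

trained_with = {
    "peft": "0.10.0",
    "transformers": "4.38.0",
    "torch": "2.2.2+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in trained_with.items():
    if installed[name] != want:
        print(f"note: {name} {installed[name]} differs from trained-with {want}")
```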

adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:721f3d284329e2ef3ab27b10facf0116e2331ee597d4de5cb20e5e1382c2be91
 size 7391688