---
library_name: transformers
license: apache-2.0
base_model: google/mt5-small
tags:
- generated_from_keras_callback
model-index:
- name: pakawadeep/mt5-small-finetuned-ctfl-backtranslation_7k
results: []
---
<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->
# pakawadeep/mt5-small-finetuned-ctfl-backtranslation_7k
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
It achieves the following results at the end of training:
- Train Loss: 4.4260
- Validation Loss: 4.1643
- Train Bleu: 0.0
- Train Gen Len: 127.0
- Epoch: 29 (epochs are zero-indexed, so this is the 30th and final epoch)
## Model description
More information needed
## Intended uses & limitations
More information needed
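
Pending fuller documentation, a minimal loading-and-inference sketch is shown below. Only the model ID comes from this card; the example input and generation settings are illustrative assumptions (the recorded Train Gen Len of 127 suggests generation was capped near 128 tokens).

```python
# Minimal inference sketch; the input text and generation settings are
# illustrative assumptions, not the author's evaluation setup.
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

model_id = "pakawadeep/mt5-small-finetuned-ctfl-backtranslation_7k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Your source sentence here", return_tensors="tf")
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```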
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
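
The optimizer configuration above can be reconstructed with the `AdamWeightDecay` class that ships with `transformers`; the sketch below mirrors those values but is an assumption about, not a copy of, the original training script.

```python
# Rebuilding the optimizer from the hyperparameters listed above;
# a sketch, not the author's original training code.
from transformers import AdamWeightDecay

optimizer = AdamWeightDecay(
    learning_rate=2e-5,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False,
    weight_decay_rate=0.01,
)
# model.compile(optimizer=optimizer)  # training precision was float32 per the card
```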
### Training results
| Train Loss | Validation Loss | Train Bleu | Train Gen Len | Epoch |
|:----------:|:---------------:|:----------:|:-------------:|:-----:|
| 18.4580 | 9.0813 | 0.0013 | 3.0 | 0 |
| 10.6168 | 8.0760 | 0.0008 | 50.0 | 1 |
| 8.9972 | 7.4303 | 0.0003 | 127.0 | 2 |
| 8.2615 | 7.1119 | 0.0010 | 3.0 | 3 |
| 7.8727 | 6.9450 | 0.0 | 2.0 | 4 |
| 7.6314 | 6.8080 | 0.0 | 2.0 | 5 |
| 7.4404 | 6.6029 | 0.0 | 2.0 | 6 |
| 7.2126 | 6.2358 | 0.0003 | 127.0 | 7 |
| 6.9077 | 5.6784 | 0.0003 | 127.0 | 8 |
| 6.5300 | 5.3165 | 0.0003 | 127.0 | 9 |
| 6.2332 | 5.0961 | 0.0003 | 127.0 | 10 |
| 5.9571 | 4.9619 | 0.0003 | 127.0 | 11 |
| 5.7344 | 4.8588 | 0.0 | 7.0 | 12 |
| 5.5347 | 4.7700 | 0.0 | 127.0 | 13 |
| 5.4059 | 4.7055 | 0.0005 | 127.0 | 14 |
| 5.2839 | 4.6485 | 0.0004 | 127.0 | 15 |
| 5.1769 | 4.5958 | 0.0 | 127.0 | 16 |
| 5.0864 | 4.5455 | 0.0 | 127.0 | 17 |
| 5.0025 | 4.5005 | 0.0 | 127.0 | 18 |
| 4.9161 | 4.4589 | 0.6412 | 127.0 | 19 |
| 4.8523 | 4.4206 | 0.8846 | 127.0 | 20 |
| 4.7860 | 4.3832 | 0.0 | 10.0 | 21 |
| 4.7340 | 4.3479 | 0.0 | 10.0 | 22 |
| 4.6707 | 4.3200 | 0.0 | 10.0 | 23 |
| 4.6181 | 4.2951 | 0.0 | 10.0 | 24 |
| 4.5780 | 4.2621 | 0.0 | 14.0 | 25 |
| 4.5309 | 4.2380 | 0.0 | 127.0 | 26 |
| 4.4944 | 4.2097 | 5.1292 | 29.0 | 27 |
| 4.4613 | 4.1865 | 0.0 | 127.0 | 28 |
| 4.4260 | 4.1643 | 0.0 | 127.0 | 29 |
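
Note that train BLEU stayed at or near zero for most epochs, with only isolated spikes (0.8846 at epoch 20, 5.1292 at epoch 27), and the generated length repeatedly sat at the apparent 127-token cap; the final checkpoint may therefore not yet produce reliable translations.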
### Framework versions
- Transformers 4.44.2
- TensorFlow 2.17.0
- Datasets 3.0.0
- Tokenizers 0.19.1