---
tags:
- text2text-generation
metrics:
- bleu
- chrf
model-index:
- name: cantonese-chinese-translation-gen1
  results: []
datasets:
- raptorkwok/cantonese-chinese-dataset-gen2
language:
- zh
---
# Cantonese-Written Chinese Translation Model
This model is a fine-tuned version of [fnlp/bart-base-chinese](https://huggingface.co/fnlp/bart-base-chinese), trained on the [Cantonese-Written Chinese Dataset Gen2](https://huggingface.co/raptorkwok/cantonese-chinese-dataset-gen2).
It achieves the following results on the evaluation set (these values match the step 10000 row of the training results table):
- Loss: 1.5413
- Bleu: 40.7808
- Chrf: 42.5628
- Gen Len: 13.2556
## Model description
The model is based on the BART Chinese model and was trained on 1M examples from a Cantonese-Written Chinese parallel corpus.
## Intended uses & limitations
The model is intended to translate Cantonese sentences into Written Chinese accurately.
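
A minimal inference sketch follows, written under several assumptions rather than taken from the author's code: the repository id `raptorkwok/cantonese-chinese-translation-gen1` is inferred from the model name in the metadata, the checkpoint is assumed to load through `AutoTokenizer` and `AutoModelForSeq2SeqLM`, and the helper names `join_tokens` and `translate` are hypothetical. The base model uses a BERT-style tokenizer, so the sketch drops `token_type_ids` before generation and removes the spaces that decoding may leave between characters.

```python
# Assumed repository id, inferred from the model name in the card metadata.
MODEL_ID = "raptorkwok/cantonese-chinese-translation-gen1"


def join_tokens(text: str) -> str:
    """Remove whitespace that a character-level tokenizer may leave between tokens.

    Note: this also removes spaces around any Latin words in the output.
    """
    return "".join(text.split())


def translate(sentences, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of Cantonese sentences (hypothetical helper)."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    # BERT-style tokenizers emit token_type_ids, which BART does not accept.
    batch.pop("token_type_ids", None)

    output_ids = model.generate(**batch, max_new_tokens=max_new_tokens)
    decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return [join_tokens(text) for text in decoded]


# Example call (downloads the checkpoint on first use):
# translate(["我哋聽日去邊度食飯？"])
```

The value of `max_new_tokens` is an arbitrary choice for illustration; the average generated length reported above is about 13 tokens.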
## Training and evaluation data
The training and evaluation data come from the [Cantonese-Written Chinese Dataset Gen2](https://huggingface.co/raptorkwok/cantonese-chinese-dataset-gen2).
## Training procedure
The training was performed using `Seq2SeqTrainer`.
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
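
The hyperparameters above can be sketched as `Seq2SeqTrainingArguments` for use with `Seq2SeqTrainer`. This is an illustration, not the author's training script: it assumes a single device (so `train_batch_size` maps to `per_device_train_batch_size`), that "Native AMP" corresponds to `fp16=True`, and that `predict_with_generate=True` was used so BLEU and chrF could be computed on generated text. The `output_dir` value and the `build_training_args` helper are hypothetical.

```python
# Hyperparameters listed in the card, mapped to TrainingArguments field names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumes a single training device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,  # assumed equivalent of "Native AMP"; requires a CUDA GPU
}


def build_training_args(output_dir="cantonese-chinese-translation"):
    """Build training arguments from the listed values (hypothetical helper)."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,       # hypothetical path
        predict_with_generate=True,  # assumption: needed for generation metrics
        **HYPERPARAMS,
    )
```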
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|
| 0.2275 | 0.05 | 5000 | 1.5256 | 40.6521 | 42.475 | 13.2277 |
| 0.1752 | 0.1 | 10000 | 1.5413 | 40.7808 | 42.5628 | 13.2556 |
| 0.1533 | 0.15 | 15000 | 1.5938 | 40.7698 | 42.5348 | 13.2678 |
| 0.1442 | 0.2 | 20000 | 1.6487 | 40.6062 | 42.353 | 13.2602 |
| 0.1317 | 0.24 | 25000 | 1.7148 | 40.569 | 42.2753 | 13.2798 |
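
The table can be read programmatically to see which checkpoint scores best on each metric. The short sketch below copies the rows above into a list (the `Row` type and variable names are illustrative only) and picks the best step by BLEU, by chrF, and by validation loss.

```python
from typing import NamedTuple


class Row(NamedTuple):
    step: int
    train_loss: float
    val_loss: float
    bleu: float
    chrf: float


# Values copied from the training results table.
ROWS = [
    Row(5000, 0.2275, 1.5256, 40.6521, 42.4750),
    Row(10000, 0.1752, 1.5413, 40.7808, 42.5628),
    Row(15000, 0.1533, 1.5938, 40.7698, 42.5348),
    Row(20000, 0.1442, 1.6487, 40.6062, 42.3530),
    Row(25000, 0.1317, 1.7148, 40.5690, 42.2753),
]

best_by_bleu = max(ROWS, key=lambda r: r.bleu)
best_by_chrf = max(ROWS, key=lambda r: r.chrf)
best_by_val_loss = min(ROWS, key=lambda r: r.val_loss)

print("Best BLEU at step", best_by_bleu.step)
print("Best chrF at step", best_by_chrf.step)
print("Lowest validation loss at step", best_by_val_loss.step)
```

BLEU and chrF both peak at step 10000, which is the row reported as the evaluation result, while validation loss is lowest at step 5000 and rises afterwards even as training loss keeps falling.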
### Framework versions
- Transformers 4.28.1
- PyTorch 2.3.1+cu121
- Datasets 2.19.1
- Tokenizers 0.13.3