---
license: llama2
language:
  - zh
metrics:
  - bleu
  - chrf
---

Base model: https://huggingface.co/indiejoseph/cantonese-llama-2-7b-oasst-v1

Finetuned following ALMA (https://github.com/fe1ixxu/ALMA) on the Cantonese-Mandarin translation task.
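A minimal sketch of how a translation prompt could be built in the ALMA style. This card does not state the exact prompt used for fine-tuning, so the template below (taken from the ALMA repository's usage example) and the language names `Mandarin` and `Cantonese` are assumptions. `build_alma_prompt` is a hypothetical helper, not part of any library.

```python
def build_alma_prompt(src_lang: str, tgt_lang: str, source_text: str) -> str:
    """Build an ALMA-style translation prompt.

    Assumption: the model expects the same template as the ALMA examples,
    with plain-English language names.
    """
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {source_text}\n"
        f"{tgt_lang}:"
    )


# Example: ask for a Mandarin -> Cantonese translation.
prompt = build_alma_prompt("Mandarin", "Cantonese", "你好")
print(prompt)
```

The model's continuation after the final `Cantonese:` line would then be read as the translation. If the outputs look off, the prompt wording is the first thing to check against the actual training setup.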

Finetuning dataset: sourced from the raw dataset released at https://github.com/meganndare/cantonese-nlp

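ALMA normally finetunes in two stages, first on monolingual data and then on parallel data. As the base model was already finetuned on Cantonese monolingual data, we skipped the monolingual stage and finetuned only on parallel sentences.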

Results:

- Man -> Can: 35.371 BLEU, 26.197 ChrF++
- Can -> Man: 36.553 BLEU, 27.471 ChrF++
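To give a feel for what a character n-gram F-score measures, here is a simplified sketch in pure Python. It is not the ChrF++ implementation behind the numbers above (this card does not say which tool was used, and ChrF++ additionally uses word n-grams), so its scores will not match them. `simple_chrf` and `char_ngrams` are hypothetical helpers written for illustration.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified character n-gram F-beta score on a 0-100 scale."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to have n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta = 2 weights recall more heavily than precision.
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


print(simple_chrf("我哋去食飯", "我哋去飲茶"))
```

Working at the character level avoids word segmentation, which is one reason character-based metrics are convenient for Chinese text. For comparable numbers, use an established evaluation library rather than this sketch.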