---
license: llama2
language:
- zh
metrics:
- bleu
- chrf
---

Base model: https://huggingface.co/indiejoseph/cantonese-llama-2-7b-oasst-v1

Finetuned following ALMA (https://github.com/fe1ixxu/ALMA) on the Cantonese-Mandarin translation task.

Finetuning dataset: sourced from the raw dataset released at https://github.com/meganndare/cantonese-nlp. Since the base model was already finetuned on Cantonese monolingual data, we finetuned only on parallel sentences.

Results:

| Direction | BLEU | ChrF++ |
|-----------|--------|--------|
| Man -> Can | 35.371 | 26.197 |
| Can -> Man | 36.553 | 27.471 |

GitHub repo: https://github.com/cmgao/nlp_project (the ALMA code is linked as a submodule).
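For inference, ALMA-finetuned models are typically prompted with a fixed zero-shot translation template. The sketch below builds such a prompt; the exact template string and the language names used here ("Mandarin", "Cantonese") are assumptions modeled on the format in the ALMA repository, so check that repo for the template this checkpoint was actually trained with.

```python
# Sketch of an ALMA-style zero-shot translation prompt.
# The template below is an assumption based on ALMA's published prompt
# format ("Translate this from X to Y: ..."); verify against the repo.

def build_prompt(src_lang: str, tgt_lang: str, text: str) -> str:
    """Build a translation prompt in the (assumed) ALMA template."""
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {text}\n"
        f"{tgt_lang}:"
    )

prompt = build_prompt("Mandarin", "Cantonese", "你在哪里？")
print(prompt)
```

The resulting string would be fed to the model's tokenizer and `generate()` call as usual; the model's completion after the final `{tgt_lang}:` line is the translation.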