---
language: dv
---
# dv-muril
This is a transfer-learning experiment that inserts Dhivehi word and
word-piece tokens into Google's MuRIL model.

This BERT-based model currently performs better than dv-wave ELECTRA on
the Maldivian News Classification task: https://github.com/Sofwath/DhivehiDatasets
## Training
- Start with MuRIL (similar to mBERT), which has no Thaana vocabulary
- Based on PanLex dictionaries, attach 1,100 Dhivehi words to Malayalam or English embeddings
- Add remaining words and word-pieces from dv-wave to vocab.txt
- Continue BERT pretraining on a TPU
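
The vocabulary steps above can be sketched on toy data. This is only one possible reading of "attach", not the notebook's actual code: it assumes new tokens are appended to the vocabulary, that dictionary-matched words start from a copy of their pivot token's embedding, and that the remaining tokens get small random vectors. The function name `extend_vocab`, the word pairs, and the embedding sizes are all hypothetical and purely illustrative.

```python
import random


def extend_vocab(vocab, embeddings, pivot_map, extra_tokens, seed=0):
    """Return (new_vocab, new_embeddings) with new tokens appended.

    vocab:        existing tokens, in vocab.txt line order
    embeddings:   one vector (list of floats) per existing token
    pivot_map:    new word -> existing token whose embedding it starts from
    extra_tokens: remaining words / word-pieces with no dictionary match
    """
    rng = random.Random(seed)
    index = {tok: i for i, tok in enumerate(vocab)}
    dim = len(embeddings[0])
    new_vocab = list(vocab)
    new_embeddings = [list(vec) for vec in embeddings]

    # Dictionary-matched words: copy the pivot token's embedding.
    for word, pivot in pivot_map.items():
        if word in index or pivot not in index:
            continue  # already present, or nothing to copy from
        index[word] = len(new_vocab)
        new_vocab.append(word)
        new_embeddings.append(list(new_embeddings[index[pivot]]))

    # Everything else: small random initialisation (assumed std of 0.02).
    for tok in extra_tokens:
        if tok in index:
            continue
        index[tok] = len(new_vocab)
        new_vocab.append(tok)
        new_embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])

    return new_vocab, new_embeddings


# Toy data; the Dhivehi/English pairs are illustrative, not taken from PanLex.
vocab = ["[PAD]", "[UNK]", "water", "book"]
embeddings = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
pivot_map = {"ފެން": "water", "ފޮތް": "book"}
extra_tokens = ["##ން", "ދިވެހި"]

new_vocab, new_embeddings = extend_vocab(vocab, embeddings, pivot_map, extra_tokens)
```

With a real checkpoint, the embedding matrix would also have to be resized to match the longer vocab.txt before pretraining continues.
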
Colab notebook:
https://colab.research.google.com/drive/1kPyd60EtOFLaHRVToNS-IKj29T9JCvMq?usp=sharing