---
language: mn
---

# ALBERT-Mongolian
[pretraining repo link](https://github.com/bayartsogt-ya/albert-mongolian)

## Model description
Here we provide a pretrained ALBERT model and a trained SentencePiece model for Mongolian text. The training data consists of the Mongolian Wikipedia corpus (from Wikipedia Downloads) and the Mongolian News corpus.

## Evaluation Result:
```
loss = 1.7478163
masked_lm_accuracy = 0.6838185
masked_lm_loss = 1.6687671
sentence_order_accuracy = 0.998125
sentence_order_loss = 0.007942731
```
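For intuition, the masked-LM cross-entropy loss above can be turned into a (pseudo-)perplexity, since perplexity is simply the exponential of the cross-entropy. A quick sketch of that conversion:

```python
import math

# Masked-LM cross-entropy loss reported in the evaluation above
masked_lm_loss = 1.6687671

# Perplexity is the exponential of the cross-entropy loss
masked_lm_perplexity = math.exp(masked_lm_loss)
print(round(masked_lm_perplexity, 2))  # ~5.31
```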

## Fine-tuning Result on Eduge Dataset:
```
                precision    recall  f1-score   support

 байгал орчин       0.83      0.76      0.80       483
    боловсрол       0.79      0.75      0.77       420
        спорт       0.98      0.96      0.97      1391
    технологи       0.85      0.83      0.84       543
      улс төр       0.88      0.87      0.87      1336
   урлаг соёл       0.89      0.94      0.91       726
        хууль       0.87      0.83      0.85       840
  эдийн засаг       0.80      0.84      0.82      1265
   эрүүл мэнд       0.84      0.90      0.87       562

     accuracy                           0.87      7566
    macro avg       0.86      0.85      0.86      7566
 weighted avg       0.87      0.87      0.87      7566
```
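As a sanity check, the macro and weighted averages in the report follow directly from the per-class f1-scores and supports. A minimal recomputation (values copied from the table above):

```python
# Per-class (f1-score, support) pairs from the classification report above
per_class = [
    (0.80, 483),   # байгал орчин
    (0.77, 420),   # боловсрол
    (0.97, 1391),  # спорт
    (0.84, 543),   # технологи
    (0.87, 1336),  # улс төр
    (0.91, 726),   # урлаг соёл
    (0.85, 840),   # хууль
    (0.82, 1265),  # эдийн засаг
    (0.87, 562),   # эрүүл мэнд
]

total = sum(s for _, s in per_class)                       # 7566 samples
macro_f1 = sum(f for f, _ in per_class) / len(per_class)   # unweighted mean
weighted_f1 = sum(f * s for f, s in per_class) / total     # support-weighted mean

print(total, round(macro_f1, 2), round(weighted_f1, 2))  # 7566 0.86 0.87
```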

## Reference
1. [ALBERT - official repo](https://github.com/google-research/albert)
2. [WikiExtractor](https://github.com/attardi/wikiextractor)
3. [Mongolian BERT](https://github.com/tugstugi/mongolian-bert)
4. [ALBERT - Japanese](https://github.com/alinear-corp/albert-japanese)
5. [Mongolian Text Classification](https://github.com/sharavsambuu/mongolian-text-classification)
6. [You et al., "Large Batch Optimization for Deep Learning" (LAMB)](https://arxiv.org/abs/1904.00962)
|
46 |
+
## Citation
|
47 |
+
```
|
48 |
+
@misc{albert-mongolian,
|
49 |
+
author = {Bayartsogt Yadamsuren},
|
50 |
+
title = {ALBERT Pretrained Model on Mongolian Datasets},
|
51 |
+
year = {2020},
|
52 |
+
publisher = {GitHub},
|
53 |
+
journal = {GitHub repository},
|
54 |
+
howpublished = {\url{https://github.com/bayartsogt-ya/albert-mongolian/}}
|
55 |
+
}
|
56 |
+
```
|
57 |
+
|
58 |
+
## For More Information
|
59 |
+
Please contact by bayartsogtyadamsuren@icloud.com
|