Cedric Lothritz committed on
Commit c279421 · 1 Parent(s): 51711b8

Update README.md

  1. README.md +8 -0
README.md CHANGED
@@ -0,0 +1,8 @@
+ ## LuxemBERT
+
+ LuxemBERT is a BERT model for the Luxembourgish language.
+ It was trained using 6.1 million Luxembourgish sentences from various sources, including the Luxembourgish Wikipedia, the Leipzig Corpora Collection, and rtl.lu.
+ In addition, we partially translated 6.1 million sentences from the German Wikipedia into Luxembourgish as a means of data augmentation. This gave us a dataset of 12.2 million sentences, which we used to train our LuxemBERT model.
+
+ If you use our model, please cite our paper:
+ [Will be added later]