Update README.md
README.md CHANGED
@@ -1,3 +1,16 @@
 ---
 license: apache-2.0
+language:
+- en
+library_name: transformers
 ---
+
+# Mega Masked LM on wikitext-103
+
+This is the location on the Hugging Face hub for the Mega MLM checkpoint. I trained this model on the `wikitext-103` dataset using standard
+BERT-style masked LM pretraining in the [original Mega repository](https://github.com/facebookresearch/mega), and initially uploaded the
+weights to hf.co/mnaylor/mega-wikitext-103. When the implementation of Mega in Hugging Face's `transformers` is finished, the weights here
+are designed to be used with `MegaForMaskedLM`, and they are compatible with the other (encoder-based) `MegaFor*` model classes.
+
+This model uses the RoBERTa base tokenizer, since the Mega paper does not implement a specific tokenizer aside from the character-level
+tokenizer used to illustrate long-sequence performance.
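
For reference, a minimal usage sketch for the card above. It assumes the finished `transformers` integration exposes `MegaForMaskedLM` with the usual `from_pretrained` API, and that the checkpoint id is `mnaylor/mega-wikitext-103` as stated in the card; the fill-mask prompt is illustrative only.

```python
import torch
from transformers import AutoTokenizer, MegaForMaskedLM

# The checkpoint ships no tokenizer of its own, so load the RoBERTa base
# tokenizer separately, matching the note in the model card.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = MegaForMaskedLM.from_pretrained("mnaylor/mega-wikitext-103")
model.eval()

# Standard fill-mask inference using RoBERTa's <mask> token
inputs = tokenizer("The capital of France is <mask>.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take the highest-scoring vocabulary id
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```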