SinclairWang committed
Commit 9c25c22
1 Parent(s): 7a4cd37
Update README.md
README.md CHANGED
```diff
@@ -22,9 +22,9 @@ The details are available at [Github:FS-ABSA](https://github.com/nustm/fs-absa)
 
 # Model Description
 
-To bridge the domain gap
-i.e., continuing pre-training the language model (i.e., mT5-small) on the unlabeled corpus of the domain of interest (i.e., `restaurant`) with the *text-infilling objective*
-(corruption rate of 15% and average span length of 1). We collect relevant 100k unlabeled reviews from Yelp for the restaurant domain then translate them into
+To bridge the domain (and language) gap between general pre-training and the task of interest in a specific domain (i.e., `restaurant` in this repo), we conducted *domain-adaptive pre-training*,
+i.e., continuing pre-training the language model (i.e., mT5-small) on the unlabeled corpus of the domain (and language) of interest (i.e., `restaurant`) with the *text-infilling objective*
+(corruption rate of 15% and average span length of 1). We collect 100k relevant unlabeled reviews from Yelp for the restaurant domain and then translate them into Dutch with the DeepL translator.
 For pre-training, we employ the [Adafactor](https://arxiv.org/abs/1804.04235) optimizer with a batch size of 16 and a constant learning rate of 1e-4.
 
 Our model can be seen as an enhanced T5 model in the restaurant domain, which can be used for various NLP tasks related to the restaurant domain,
```
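For readers who want to reproduce the recipe the updated paragraph describes, here is a minimal sketch of continued pre-training with the text-infilling objective (corruption rate 15%, average span length 1) and the Adafactor settings above, using the Hugging Face `transformers` API. This is not the authors' released training script: the `infill` helper is a simplified stand-in for T5's span-corruption pipeline, and the corpus filename `yelp_reviews_nl.txt` is an assumption.

```python
import random

from transformers import Adafactor, AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")


def infill(text, corruption_rate=0.15):
    """T5-style text infilling with an average span length of 1: replace ~15%
    of the tokens with sentinels; the target spells out, after each sentinel,
    the tokens it replaced. Assumes each review is short enough to stay
    within mT5's 100 sentinel tokens."""
    ids = tokenizer(text, add_special_tokens=False).input_ids
    masked = set(random.sample(range(len(ids)), max(1, round(len(ids) * corruption_rate))))
    src, tgt, span_id, prev = [], [], 0, -2
    for i, tok in enumerate(ids):
        if i in masked:
            if i != prev + 1:  # a new span starts -> emit the next sentinel
                sentinel = tokenizer.convert_tokens_to_ids(f"<extra_id_{span_id}>")
                src.append(sentinel)
                tgt.append(sentinel)
                span_id += 1
            tgt.append(tok)
            prev = i
        else:
            src.append(tok)
    return src + [tokenizer.eos_token_id], tgt + [tokenizer.eos_token_id]


# Illustrative corpus file: one translated review per line (name is assumed).
reviews = [l.strip() for l in open("yelp_reviews_nl.txt", encoding="utf-8") if l.strip()]

# Adafactor with a constant learning rate of 1e-4, as stated in the card.
optimizer = Adafactor(model.parameters(), lr=1e-4,
                      scale_parameter=False, relative_step=False, warmup_init=False)

model.train()
for start in range(0, len(reviews), 16):  # batch size 16
    pairs = [infill(r) for r in reviews[start:start + 16]]
    batch = tokenizer.pad({"input_ids": [s for s, _ in pairs]}, return_tensors="pt")
    labels = tokenizer.pad({"input_ids": [t for _, t in pairs]}, return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # mask padding out of the loss
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

With an average span length of 1, span corruption reduces to masking individual tokens (merging adjacent masks into one sentinel), which is why the helper stays this small.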