SinclairWang committed
Commit 9c25c22
1 Parent(s): 7a4cd37
Update README.md
README.md CHANGED
```diff
@@ -22,9 +22,9 @@ The details are available at [Github:FS-ABSA](https://github.com/nustm/fs-absa)
 
 # Model Description
 
-To bridge the domain gap
-i.e., continuing pre-training the language model (i.e., mT5-small) on the unlabeled corpus of the domain of interest (i.e., `restaurant`) with the *text-infilling objective*
-(corruption rate of 15% and average span length of 1). We collect relevant 100k unlabeled reviews from Yelp for the restaurant domain then translate them into
+To bridge the domain (and language) gap between general pre-training and the task of interest in a specific domain (i.e., `restaurant` in this repo), we conducted *domain-adaptive pre-training*,
+i.e., continuing pre-training the language model (i.e., mT5-small) on the unlabeled corpus of the domain (and language) of interest (i.e., `restaurant`) with the *text-infilling objective*
+(corruption rate of 15% and average span length of 1). We collect 100k relevant unlabeled reviews from Yelp for the restaurant domain and then translate them into Dutch with the DeepL translator.
 For pre-training, we employ the [Adafactor](https://arxiv.org/abs/1804.04235) optimizer with a batch size of 16 and a constant learning rate of 1e-4.
 
 Our model can be seen as an enhanced T5 model in the restaurant domain, which can be used for various NLP tasks related to the restaurant domain,
```
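For readers who want to reproduce the recipe the updated paragraph describes, here is a minimal sketch of continued pre-training with the text-infilling objective (corruption rate 15%, average span length 1) and the Adafactor settings above, using the Hugging Face `transformers` API. This is not the authors' released training script: the `infill` helper is a simplified stand-in for T5's span-corruption pipeline, and the corpus filename `yelp_reviews_nl.txt` is an assumption.

```python
import random

from transformers import Adafactor, AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")


def infill(text, corruption_rate=0.15):
    """T5-style text infilling with an average span length of 1: replace ~15%
    of the tokens with sentinels; the target spells out, after each sentinel,
    the tokens it replaced. Assumes each review is short enough to stay
    within mT5's 100 sentinel tokens."""
    ids = tokenizer(text, add_special_tokens=False).input_ids
    masked = set(random.sample(range(len(ids)), max(1, round(len(ids) * corruption_rate))))
    src, tgt, span_id, prev = [], [], 0, -2
    for i, tok in enumerate(ids):
        if i in masked:
            if i != prev + 1:  # a new span starts -> emit the next sentinel
                sentinel = tokenizer.convert_tokens_to_ids(f"<extra_id_{span_id}>")
                src.append(sentinel)
                tgt.append(sentinel)
                span_id += 1
            tgt.append(tok)
            prev = i
        else:
            src.append(tok)
    return src + [tokenizer.eos_token_id], tgt + [tokenizer.eos_token_id]


# Illustrative corpus file: one translated review per line (name is assumed).
reviews = [l.strip() for l in open("yelp_reviews_nl.txt", encoding="utf-8") if l.strip()]

# Adafactor with a constant learning rate of 1e-4, as stated in the card.
optimizer = Adafactor(model.parameters(), lr=1e-4,
                      scale_parameter=False, relative_step=False, warmup_init=False)

model.train()
for start in range(0, len(reviews), 16):  # batch size 16
    pairs = [infill(r) for r in reviews[start:start + 16]]
    batch = tokenizer.pad({"input_ids": [s for s, _ in pairs]}, return_tensors="pt")
    labels = tokenizer.pad({"input_ids": [t for _, t in pairs]}, return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # mask padding out of the loss
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

With an average span length of 1, span corruption reduces to masking individual tokens (merging adjacent masks into one sentinel), which is why the helper stays this small.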