Commit 242e329
Parent(s): b4bbf24
Update README.md
README.md CHANGED
@@ -14,6 +14,7 @@ tags:
 - summarization
 - translation
 - question-answering
+pipeline_tag: fill-mask
 ---
 ## Extend vocabulary and Pretrain
 We utilized [SentencePiece](https://github.com/google/sentencepiece) to retrain a tokenizer for Vietnamese, English, and Chinese. This newly trained tokenizer's vocabulary was then combined with Flan-T5's original vocabulary, eliminating any duplicate tokens. The resulting merged vocabulary consists of 106611 tokens.
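
The vocabulary merge described in the README text above (combine the retrained tokenizer's vocabulary with the original one and drop duplicates) can be illustrated with a minimal sketch. This is not the authors' actual script: the function name `merge_vocab` and the toy token lists are hypothetical, and real SentencePiece vocabularies would first be loaded from their model files. The sketch only assumes that the original vocabulary keeps its order, so existing token ids stay valid, and that genuinely new tokens are appended at the end.

```python
def merge_vocab(base_vocab, new_vocab):
    """Return base_vocab followed by the tokens of new_vocab it does not contain.

    The order of base_vocab is preserved so that existing token ids stay valid;
    tokens already present in the base vocabulary are skipped.
    """
    seen = set(base_vocab)
    merged = list(base_vocab)
    for token in new_vocab:
        if token not in seen:
            seen.add(token)
            merged.append(token)
    return merged


# Toy stand-ins for the original and the retrained vocabularies (hypothetical data).
base = ["▁the", "▁a", "ing"]
new = ["▁the", "▁xin", "▁chào", "ing", "▁你好"]

merged = merge_vocab(base, new)
print(len(merged), merged)
```

Appending rather than interleaving means the embedding rows of the original tokens can be reused unchanged, and only the rows for the appended tokens need to be initialized before further pretraining.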