azizbarank committed 810352b (parent: 2f9c5e7): Update README.md
---
license: mit
---
## The T5 Base Model for the Czech Language
This is a T5 base model for the Czech language, created as a smaller version of the google/mt5-base model (https://huggingface.co/google/mt5-base).
To make this model, I retained only the Czech and some of the English embeddings from the original multilingual model.
### Modifications to the original multilingual T5 base model
1. The number of parameters was reduced from 582M to 244M.
2. The SentencePiece vocabulary was shrunk from 250K to 30K tokens by keeping only the top 20K Czech and top 10K English tokens (a rough sketch of this trimming is shown below the list).
3. The model size on disk was reduced from 2.2 GB to 0.9 GB.
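
For illustration, the snippet below is a minimal sketch of this kind of vocabulary and embedding trimming, not the exact script used to build this model: the Czech/English token selections are placeholders (in practice they come from counting token frequencies on real corpora), and rebuilding the SentencePiece vocabulary is omitted.

```python
from transformers import MT5ForConditionalGeneration

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")
print(f"original parameters: {model.num_parameters() / 1e6:.0f}M")  # ~582M

# Ids 0-2 are <pad>, </s> and <unk>; the two ranges below are placeholders
# standing in for the real "top 20K Czech + top 10K English" selections.
czech_top_20k = list(range(3, 20003))
english_top_10k = list(range(20003, 30003))
kept_ids = sorted({0, 1, 2} | set(czech_top_20k) | set(english_top_10k))

# Copy the kept rows of the shared input embedding and of the LM head
# (mT5 does not tie input and output embeddings).
new_embed = model.shared.weight.data[kept_ids].clone()
new_lm_head = model.lm_head.weight.data[kept_ids].clone()

# Shrink both matrices to the new vocabulary size, then write the kept rows back.
model.resize_token_embeddings(len(kept_ids))
model.shared.weight.data = new_embed
model.lm_head.weight.data = new_lm_head

print(f"trimmed parameters: {model.num_parameters() / 1e6:.0f}M")  # ~244M
# The tokenizer's SentencePiece model must also be rebuilt to match kept_ids
# (not shown here); see the referenced post for the full procedure.
```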
### Notes
Since this is only the base (pre-trained) model for Czech, it needs to be fine-tuned on an appropriate dataset before it can be used for any downstream task.
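
As a quick start, the model can be loaded with 🤗 Transformers like any other (m)T5 checkpoint and then fine-tuned; the repository id below is only a placeholder for this model's actual Hub name.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder id: replace with this repository's actual Hub name.
model_id = "azizbarank/<this-czech-t5-model>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The trimmed tokenizer keeps only Czech (plus some English) tokens:
batch = tokenizer("Ukázkový český text.", return_tensors="pt")

# Fine-tune `model` on a Czech downstream task (e.g. summarization or
# question answering) before using it for inference.
```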
### References
Most of the work behind this model is based on the post by David Dale, "How to adapt a multilingual T5 model for a single language" (https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90).