flax-community/arabic-t5-small

salti committed on Jul 25, 2021

Commit 37e00b7 • 1 Parent(s): 93d26be

Undo readme overwrite mistake

Files changed (1)

README.md +37 -2

README.md CHANGED

@@ -1,3 +1,38 @@
-The model and logs in this directory are for a faulty run where `dropout_rate` was mistakenly set to `0.1` instead of `0`.
-The model here was trained only for `10'000` steps.

+---
+language:
+  - ar
+datasets:
+  - mc4
+  - oscar
+  - arabic_billion_words
+---
+# arabic-t5-small
+This is a T5 v1.1 (small) model trained on the concatenation of the Arabic Billion Words corpus and the Arabic subsets of the mC4 and OSCAR datasets. Due to time limitations, the model could only be trained on about `10%` of the whole dataset.
+## Training parameters
+|       Parameter       |     Value     |
+| :-------------------: | :-----------: |
+|         Steps         |   `22'000`    |
+|  Training batch size  |     `384`     |
+| Evaluation batch size |     `768`     |
+|     Learning rate     |    `1e-2`     |
+|         dtype         | `jnp.float32` |
+## Note for finetuning:
+This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is `0`.
+To finetune the model, dropout should be turned back on, like this:
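+```python
+from transformers import T5ForConditionalGeneration
+
+# Override the config default (dropout_rate=0) when loading for finetuning.
+model = T5ForConditionalGeneration.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
+```
+or, equivalently:
+```python
+from transformers import AutoModelForSeq2SeqLM
+
+# The Auto class resolves to the same T5 architecture; the override is identical.
+model = AutoModelForSeq2SeqLM.from_pretrained("flax-community/arabic-t5-small", dropout_rate=0.1)
+```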
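
For a sense of scale, the training parameters in the restored README can be turned into a rough count of sequences seen during pretraining. This is a minimal sketch, not part of the commit: it assumes each step consumes exactly one training batch (no gradient accumulation) and that the batch size counts sequences. The helper names `sequences_seen` and `implied_dataset_size` are hypothetical. Because the README only says "about `10%`", the implied dataset size is an order-of-magnitude estimate at best.

```python
# Figures taken from the "Training parameters" table in the README.
STEPS = 22_000
TRAIN_BATCH_SIZE = 384


def sequences_seen(steps: int, batch_size: int) -> int:
    """Sequences consumed in training, assuming one batch per step."""
    return steps * batch_size


def implied_dataset_size(seen: int, percent_covered: int) -> int:
    """Rough dataset size implied by having covered `percent_covered` percent."""
    return seen * 100 // percent_covered


seen = sequences_seen(STEPS, TRAIN_BATCH_SIZE)
print(f"sequences seen: {seen:,}")
# "About 10%" is approximate, so treat this as a ballpark figure only.
print(f"implied dataset size: {implied_dataset_size(seen, 10):,}")
```

Under these assumptions the run covers 8,448,000 sequences, which puts the full concatenated corpus at roughly 84 million sequences.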