HamzaNaser committed on
Commit
e1ef5d7
1 Parent(s): 6474150

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -22,7 +22,7 @@ model-index:
 
 
 # Dialects-to-MSA-Transformer overview
-This model is optimized to convert text written in various non-standard Arabic dialects into Modern Standard Arabic (MSA). It was fine-tuned on 0.8M sentence pairs generated with the OpenAI API gpt-4o-mini text-generation model. Besides converting dialects into MSA, the model can also be used for other NLP tasks such as text correction, diacritization and sentence punctuation.
+This model is optimized to convert text written in various non-standard Arabic dialects into Modern Standard Arabic (MSA). It was fine-tuned on 0.8M sentence pairs generated with the OpenAI API gpt-4o-mini text-generation model. Besides converting dialects into MSA, the model can also be used for other NLP tasks such as text correction, diacritization, sentence punctuation and machine translation.
 
 
 
@@ -85,7 +85,7 @@ Inspecting large pairs of texts might be tedious, thus we have taken a sample of
 | Data Set Size | GPU Device | Epochs | Training Time | BLEU Score |
 |:-------------:|:----------:|:------:|:-------------:|:----------:|
 | 0.8M | A100 | 3 | 7.7 Hrs | 46.9 |
-| 3.0M | A100 | 1 | ??? | ??? |
+| 2.6M | A100 | 1 | ??? | ??? |
 
 ## Costs and Resources
 There were two main computing resources used when the Dialects-to-MSA Transformer was built: one is the generation of MSA sequences using the GPT model; the second is the GPU used to train and adjust the parameters of the pretrained model.
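The BLEU Score column in the table above measures n-gram overlap between model output and reference sentences. A minimal sentence-level sketch in pure Python (uniform n-gram weights and the crude smoothing term are assumptions for illustration; real evaluations typically use a standard implementation such as sacreBLEU):

```python
import math
from collections import Counter

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Illustrative sentence-level BLEU: geometric mean of 1..max_n-gram
    precisions, scaled by a brevity penalty. Not a drop-in replacement
    for a standard scorer."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        # clipped n-gram counts: each candidate n-gram is credited at
        # most as many times as it appears in the reference
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())
        total = max(sum(cand_ngrams.values()), 1)
        # crude smoothing so a zero match does not zero the whole score
        precisions.append(max(overlap, 1e-9) / total)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # brevity penalty: punish candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100 * bp * geo_mean

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # identical sentences score 100.0
```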