Update README.md
README.md
CHANGED
@@ -15,6 +15,7 @@ tags:
 * Model Size: 134M
 * Context Window Size: 768

+> [!NOTE]
 > ArabianGPT is a custom-trained version of the GPT-2 base model, specifically tailored for the Arabic language. It is designed to understand and generate Arabic text, making it suitable for various natural language processing tasks in Arabic.

 # Training
@@ -25,7 +26,9 @@ tags:
 * Number of Parameters : 134 M Params
 * Steps: 337,500
 * Loss: 3.97
-
+
+> [!NOTE]
+> The model was trained on the Abu Elkhiar dataset, a comprehensive Arabic text corpus encompassing a wide range of topics. The training process focused on adapting the model to understand the nuances and complexities of the Arabic language.

 # Tokenizer
 Type: Custom trained SentencePiece tokenizer
@@ -36,7 +39,6 @@ Vocabulary Size: 64K
 More info about AraNizer can be found here [Link](https://github.com/omarnj-lab/aranizer/tree/main)


-
 # Usage
 ArabianGPT can be used for text generation tasks in Arabic.

@@ -55,12 +57,14 @@ pipe.predict(text)
 ```

 # Limitations
-
+> [!TIP]
+> As with any language model, ArabianGPT may have limitations in understanding context or generating text in certain scenarios. Users should be aware of these limitations and use the model accordingly.

 # Ethical Considerations
 We emphasize responsible usage of ArabianGPT. Users should ensure that the generated text is used ethically and does not propagate misinformation or harmful content.

 # Citation
+
 If you use ArabianGPT in your research or application, please cite it as follows:

 ```
@@ -72,7 +76,7 @@ If you use ArabianGPT in your research or application, please cite it as follows
 }
 ```
 # Acknowledgments
-We thank Prince Sultan University, especially the Robotics and Internet of Things Lab, for their support.
+> We thank Prince Sultan University, especially the Robotics and Internet of Things Lab, for their support.

 # Contact
 For inquiries regarding ArabianGPT, please contact onajar@psu.edu.sa.
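The Usage hunk above surfaces only `pipe.predict(text)` as diff context, so the README's full loading snippet is not part of this change. A minimal sketch of Arabic text generation with the standard Hugging Face `transformers` pipeline (the README itself may use a different wrapper), assuming a placeholder Hub ID `riotu-lab/ArabianGPT-0.1B`:

```python
# Minimal sketch: Arabic text generation with a GPT-2-style model via transformers.
# "riotu-lab/ArabianGPT-0.1B" is a placeholder Hub ID; substitute the actual repository name.
from transformers import pipeline

pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-0.1B")

prompt = "الذكاء الاصطناعي هو"  # Arabic prompt: "Artificial intelligence is ..."
outputs = pipe(prompt, max_new_tokens=50, do_sample=True, top_p=0.95)
print(outputs[0]["generated_text"])
```

Keeping the prompt plus generated tokens within the 768-token context window listed above is left to the caller.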
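The Tokenizer section likewise lists a custom 64K-vocabulary SentencePiece tokenizer (AraNizer) without loading code; a sketch with `AutoTokenizer`, under the same placeholder Hub ID assumption:

```python
# Sketch: inspecting the custom SentencePiece tokenizer distributed with the model.
# The Hub ID is again a placeholder, not confirmed by this diff.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("riotu-lab/ArabianGPT-0.1B")

text = "اللغة العربية غنية بالمفردات"  # "The Arabic language is rich in vocabulary"
ids = tokenizer.encode(text)
print(len(tokenizer))                        # vocabulary size (~64K per the card)
print(tokenizer.convert_ids_to_tokens(ids))  # subword pieces produced by the tokenizer
```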