---
license: apache-2.0
language:
- ar
tags:
- ArabianGPT
widget:
- text: "أعلنت وزارة الحج في المملكة العربية السعودية"
example_title: "مثال ١"
- text: "يبدو اليوم جميلا، سأقوم بتحضير"
example_title: "مثال ٢"
- text: "إن التقنيات الحديثة"
example_title: "مثال ٣"
---
# ArabianGPT Model Overview
## Disclaimer for the Use of Large Language Models (LLMs) for Text Generation
<p style="color: red;">We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.8B, and users engage with and apply the model's outputs at their own risk.</p>
> **Important Note:** Currently, we offer a raw pre-trained model. Our team is actively working on releasing instruction-tuned LLMs that are fine-tuned and augmented with RLHF. The first set of pre-trained models has been made available for community exploration. While we do have models fine-tuned for specific tasks such as summarization and sentiment analysis, they are still in the development phase.
## How Can You Use This Pre-Trained Model?
You are invited to use this pre-trained, native Arabic language model as an experimental tool: assess its capabilities, help fine-tune it, and evaluate its performance across a variety of downstream tasks. We encourage you to review our technical report for a comprehensive account of the model's performance metrics and the downstream tasks on which it has been tested; it provides valuable insight into the model's applicability and effectiveness across diverse applications.
## Introduction
ArabianGPT-0.8B, part of the ArabianLLM initiatives, is a specialized GPT model optimized for the Arabic language. Developed at Prince Sultan University's Robotics and Internet of Things Lab, this model is a leap forward in natural language modeling and generation for Arabic, tackling the language's unique challenges.
## Key Features
- **Architecture**: GPT-2
- **Model Size**: 0.8 billion parameters
- **Layers**: 36
- **Attention Heads**: 20
- **Context Window Size**: 1024 tokens
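These architecture details can be verified programmatically once the model is downloaded from the Hub. The snippet below is a minimal sketch, assuming the repository id `riotu-lab/ArabianGPT-08B` used later in this card and the standard GPT-2 configuration fields in Transformers:

```python
from transformers import AutoConfig

# Fetch the model configuration from the Hugging Face Hub.
config = AutoConfig.from_pretrained("riotu-lab/ArabianGPT-08B")

# Standard GPT-2 config fields: layers, attention heads, context window.
print(config.n_layer)      # expected: 36
print(config.n_head)       # expected: 20
print(config.n_positions)  # expected: 1024 (context window size)
```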
## Training
- **Dataset**: Scraped texts, comprising scientific articles and general texts
- **Data Size**: 117 GB
- **Tokenizer**: Aranizer 64K
- **Tokens**: Over 14 billion
- **Hardware**: 5 NVIDIA A100 GPUs
- **Performance**: loss of 3.6
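The Aranizer 64K tokenizer determines how Arabic text maps onto the model's vocabulary. Below is a minimal sketch of inspecting it, assuming (as is conventional on the Hub) that the tokenizer files ship with the model repository:

```python
from transformers import AutoTokenizer

# Load the tokenizer bundled with the model repository.
tokenizer = AutoTokenizer.from_pretrained("riotu-lab/ArabianGPT-08B")

# Vocabulary size should reflect the 64K Aranizer vocabulary.
print(len(tokenizer))

# Tokenize a sample Arabic sentence ("Modern technologies").
print(tokenizer.tokenize("إن التقنيات الحديثة"))
```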
## Role in ArabianLLM Initiatives
ArabianGPT-0.8B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.
## Usage
The model is suitable for Arabic text generation tasks. Example usage with the Transformers pipeline:
```python
from transformers import pipeline

# Load the model from the Hugging Face Hub.
pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-08B", max_new_tokens=1024)

# Prompt in Arabic ("The Ministry of Hajj in the Kingdom of Saudi Arabia announced").
text = "أعلنت وزارة الحج في المملكة العربية السعودية"
print(pipe(text))
```
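For finer control over decoding than the pipeline exposes, the model can also be driven directly. This is a minimal sketch under the same repository id; the sampling values shown are illustrative assumptions, not tuned recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "riotu-lab/ArabianGPT-08B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Prompt in Arabic ("Today looks beautiful, I will prepare").
prompt = "يبدو اليوم جميلا، سأقوم بتحضير"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation; decoding parameters here are illustrative, not tuned.
output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```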
## Limitations and Ethical Considerations
- The model may have context understanding or text generation limitations in certain scenarios.
- Emphasis on ethical use to prevent misinformation or harmful content propagation.
## Acknowledgments
Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.
## Contact Information
For inquiries: [riotu@psu.edu.sa](mailto:riotu@psu.edu.sa).