# GPT-2 for Tigrinya Language

This repository contains a GPT-2 model trained from scratch on Tigrinya text data. The model was trained using the Hugging Face Transformers library.

## Model Details

- Model Type: GPT-2
- Language: Tigrinya
- Vocabulary Size: 16,000
- Maximum Sequence Length: 128 tokens
- Model Size: Small
- Number of Parameters: 33,523,200

## Training Details

- Number of Epochs: 12
- Batch Size: 1 (with gradient accumulation steps of 4)
- Learning Rate: 5e-4
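
With a batch size of 1 and 4 gradient accumulation steps, gradients from four consecutive sequences are accumulated before each optimizer update, so the effective batch size is 4. A minimal sketch of that arithmetic follows; the function name is illustrative, and the single-device default is an assumption, since the number of training devices is not stated here.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of sequences that contribute to one optimizer update."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Batch size and accumulation steps from the list above.
# num_devices=1 is an assumption, not a documented setting.
print(effective_batch_size(1, 4))  # 4
```

If the Transformers `Trainer` API was used (an assumption), these two settings would correspond to the `per_device_train_batch_size` and `gradient_accumulation_steps` fields of `TrainingArguments`.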

## Dataset Statistics

- Total number of words: 16,061,839
- Total number of unique words: 458,901
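
These two counts give a rough sense of how repetitive the corpus is. The sketch below derives the type-token ratio and the average number of occurrences per unique word from them. The variable names are illustrative, and how "words" were counted (for example, by whitespace splitting) is not stated here.

```python
total_words = 16_061_839   # total number of words in the corpus
unique_words = 458_901     # total number of unique words

# Share of word occurrences that are distinct word forms.
type_token_ratio = unique_words / total_words

# How often each unique word appears on average.
mean_occurrences = total_words / unique_words

print(f"type-token ratio: {type_token_ratio:.4f}")                  # 0.0286
print(f"mean occurrences per unique word: {mean_occurrences:.1f}")  # 35.0
```

Because the 16,000-entry vocabulary is far smaller than the 458,901 unique words, most words are presumably split into several subword tokens by the tokenizer.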
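
## Usage

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-small')

# Generate text from a Tigrinya prompt; max_length counts tokens, including the prompt
text = generator("ትግራይ", max_length=60)

# The pipeline returns a list of dicts, one per generated sequence
print(text[0]["generated_text"])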