# GPT-2 for Tigrinya

This repository contains a GPT-2 model trained from scratch on Tigrinya text using the Hugging Face Transformers library.

## Model Details

- Model Type: GPT-2
- Language: Tigrinya
- Vocabulary Size: 16,000
- Maximum Sequence Length: 128 tokens
- Model Size: small
- Number of Parameters: 33,523,200
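
The repository does not state the internal architecture (number of layers, attention heads, or embedding width). As a hypothetical sketch only, a configuration along the following lines would match the reported vocabulary size and context length, with the remaining values chosen so the total lands near the reported ~33.5M parameters:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical configuration: vocab_size and n_positions are the reported
# values; n_embd, n_layer, and n_head are assumptions, not confirmed by
# the repository.
config = GPT2Config(
    vocab_size=16_000,  # reported vocabulary size
    n_positions=128,    # reported maximum sequence length
    n_embd=512,         # assumption: reduced embedding width
    n_layer=8,          # assumption: reduced depth
    n_head=8,           # assumption: reduced head count
)

model = GPT2LMHeadModel(config)
print(f"Parameters: {model.num_parameters():,}")  # close to the reported count
```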

## Training Details

- Number of Epochs: 12
- Batch Size: 1 (with 4 gradient accumulation steps, for an effective batch size of 4)
- Learning Rate: 5e-4
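
As a minimal sketch, these hyperparameters map onto Hugging Face `TrainingArguments` as shown below; the output directory and any options not listed above are placeholders, not details of the actual training run:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the reported hyperparameters; output_dir and
# all unlisted options are placeholders.
training_args = TrainingArguments(
    output_dir="gpt2-tigrinya-small",  # placeholder path
    num_train_epochs=12,               # reported epochs
    per_device_train_batch_size=1,     # reported batch size
    gradient_accumulation_steps=4,     # reported accumulation steps
    learning_rate=5e-4,                # reported learning rate
)
```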

## Dataset Statistics

- Total number of words: 16,061,839
- Total number of unique words: 458,901

## Usage

```python
from transformers import pipeline

# Load the model
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-small')

# Generate text
text = generator("ትግራይ", max_length=60)
print(text[0]['generated_text'])
```
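
For finer control over decoding (sampling, top-k, etc.), the model can also be loaded directly instead of through `pipeline`. This is the standard Transformers pattern, shown here as a sketch; the sampling settings are illustrative, not recommendations from the model authors:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('luel/gpt2-tigrinya-small')
model = AutoModelForCausalLM.from_pretrained('luel/gpt2-tigrinya-small')

# Encode a Tigrinya prompt and sample a continuation.
inputs = tokenizer("ትግራይ", return_tensors="pt")
outputs = model.generate(**inputs, max_length=60, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```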