---
language: ti
license: mit
library_name: transformers
tags:
- tigrinya
- gpt2
- text-generation
metrics:
- perplexity
- loss
pipeline_tag: text-generation
---
# Model Card for GPT-2 Tigrinya Medium

## Model Summary
This is a GPT-2 model trained from scratch on 20.6 million tokens of Tigrinya text, drawn primarily from news sources.
## Model Description

- **Model type:** GPT-2
- **Language:** Tigrinya (ትግርኛ)
- **Finetuned from model:** None; trained from scratch rather than fine-tuned from an existing checkpoint
## Model Architecture

- **Parameters:** 51.9M
- **Context window:** 128 tokens
- **Vocabulary size:** 52,000
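
This card does not state the depth or width of the network, but a configuration consistent with the numbers above can be sketched with `GPT2Config`. The `n_embd`, `n_layer`, and `n_head` values below are assumptions that, together with the stated vocabulary and context sizes (and weight tying between the input embeddings and the output head), land close to the reported 51.9M parameters:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical configuration; only vocab_size and n_positions come
# from this card, the rest are assumed values chosen to roughly
# reproduce the ~51.9M parameter count.
config = GPT2Config(
    vocab_size=52_000,   # stated vocabulary size
    n_positions=128,     # stated context window
    n_embd=512,          # assumed hidden size
    n_layer=8,           # assumed number of transformer blocks
    n_head=8,            # assumed number of attention heads
)

model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")  # ≈ 51.9M
```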
## Training Details

- **Training regime:** fp16 mixed precision
- **Epochs:** 12
- **Batch size:** 6 (with gradient accumulation steps of 8, for an effective batch size of 48)
- **Learning rate:** 5e-4
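
As a point of reference, these hyperparameters map onto `transformers` `TrainingArguments` roughly as follows; the output directory and any scheduler or warmup settings are assumptions, since only the values listed above appear in this card:

```python
from transformers import TrainingArguments

# Sketch of the training setup; hyperparameter values are taken
# from this card, output_dir is a hypothetical placeholder.
training_args = TrainingArguments(
    output_dir="gpt2-tigrinya-medium",   # hypothetical path
    num_train_epochs=12,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=8,       # effective batch size of 48
    learning_rate=5e-4,
    fp16=True,                           # mixed-precision training
)
```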
## Evaluation

- **Training perplexity:** 28.6
- **Training loss:** 3.12
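
For context, perplexity is conventionally the exponential of the mean cross-entropy loss; a minimal sketch of that relationship is below (the two figures above need not satisfy it exactly if they were logged at different points in training):

```python
import math

# Perplexity is exp(mean cross-entropy loss).
def perplexity(mean_loss: float) -> float:
    return math.exp(mean_loss)

print(round(perplexity(3.12), 1))  # 22.6
```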
## Usage

```python
from transformers import pipeline

# Load the model via the text-generation pipeline
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-medium')

prompt = "ክልል ትግራይ"

# Generate text from the prompt
text = generator(prompt, max_length=100)[0]['generated_text']
print(text)
```
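
For finer control over decoding, the model and tokenizer can also be loaded directly. The sampling parameters below are illustrative assumptions, not values recommended in this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("luel/gpt2-tigrinya-medium")
model = AutoModelForCausalLM.from_pretrained("luel/gpt2-tigrinya-medium")

inputs = tokenizer("ክልል ትግራይ", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=100,   # must stay within the 128-token context window
    do_sample=True,   # assumed sampling settings, tune as needed
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```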
## Limitations

- Limited context window of 128 tokens.
- Best suited for short- to medium-length Tigrinya text generation.
- Outputs should be reviewed for accuracy before use.