---
language: ti
license: mit
library_name: transformers
tags:
  - tigrinya
  - gpt2
  - text-generation
metrics:
  - perplexity
  - loss
pipeline_tag: text-generation
---

Model Card for GPT-2 Tigrinya Medium

Model Summary

This is a GPT-2 model trained from scratch on Tigrinya text data. It was trained on 20.6 million tokens, primarily from news sources.

Model Description

  • Model type: GPT-2
  • Language: Tigrinya (ትግርኛ)
  • Finetuned from model: None (trained from scratch)

Model Architecture

  • Parameters: 51.9M
  • Context Window: 128 tokens
  • Vocabulary Size: 52,000
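
The reported parameter count can be roughly sanity-checked with the standard GPT-2 sizing arithmetic. The hidden size and layer count below are assumptions (they are not stated in this card), chosen only to illustrate how the vocabulary, context window, and transformer blocks add up:

```python
# Rough GPT-2 parameter estimate (weight matrices only; biases and
# LayerNorm parameters are omitted for simplicity).
vocab_size = 52_000   # from the card
context = 128         # from the card
d_model = 512         # assumption, not stated in the card
n_layers = 8          # assumption, not stated in the card

# Token embeddings (tied with the output head in GPT-2) + position embeddings
embeddings = vocab_size * d_model + context * d_model
# Each transformer block: attention (4*d^2) + MLP (8*d^2)
per_block = 12 * d_model ** 2
total = embeddings + n_layers * per_block
print(f"{total / 1e6:.1f}M")  # ≈ 51.9M
```

Under these assumed dimensions the estimate lands close to the stated 51.9M, but the actual architecture may differ.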

Training Details

  • Training regime: fp16 mixed precision
  • Number of Epochs: 12
  • Batch Size: 6 (with gradient accumulation steps of 8)
  • Learning Rate: 5e-4
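
With gradient accumulation, the effective batch size is the per-device batch size multiplied by the accumulation steps; a quick check of the numbers above:

```python
batch_size = 6        # per-device batch size, from the card
grad_accum_steps = 8  # gradient accumulation steps, from the card

# Gradients are accumulated over 8 forward passes before each optimizer
# step, so each update sees 48 examples.
effective_batch_size = batch_size * grad_accum_steps
print(effective_batch_size)  # 48
```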

Evaluation

  • Training Perplexity: 28.6
  • Training Loss: 3.12
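
Perplexity is conventionally the exponential of the mean cross-entropy loss. A minimal sketch of that relationship using the reported loss (note that exp(3.12) ≈ 22.6, so the reported 28.6 perplexity was presumably measured over a different token set or checkpoint than the 3.12 loss):

```python
import math

# Perplexity = exp(mean cross-entropy loss), with loss in nats per token.
loss = 3.12  # reported training loss
perplexity = math.exp(loss)
print(round(perplexity, 1))  # ≈ 22.6
```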

Usage

from transformers import pipeline

# Load the model (downloads from the Hugging Face Hub on first use)
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-medium')

prompt = "ክልል ትግራይ"
# Generate text; max_length counts prompt plus generated tokens and must
# stay within the model's 128-token context window
text = generator(prompt, max_length=100)[0]['generated_text']
print(text)

Limitations

  • Limited context window of 128 tokens.
  • Best suited for medium-length Tigrinya text generation.
  • Outputs should be reviewed for accuracy.