FLAN-T5-Base pre-trained on Historical Text Completion

This model is a version of google/flan-t5-base that has been further trained on historical text completion tasks.

Model description

This model has been trained to complete historical texts, with the aim of improving its handling of historical context and language.

Intended uses & limitations

This model is intended for further fine-tuning on specific historical NLP tasks or for generating historically-aware text completions.

Training and evaluation data

The model was trained on a subset of the dataset ambrosfitz/just_history_xl_masked. The size of the subset was limited by available GPU memory.

Training procedure

The model was trained using the following hyperparameters:

  • Number of epochs: 3
  • Batch size: 2
  • Learning rate: 5e-05
  • Gradient Accumulation Steps: 8
  • Weight Decay: 0.01
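
As a minimal sketch of what these settings imply, the snippet below stores the listed values in a plain dictionary and computes the effective batch size per optimizer update. The key names follow the keyword arguments of `transformers.TrainingArguments`, on the assumption that the Hugging Face Trainer API was used, which this card does not state. `effective_batch_size` is a hypothetical helper, and the single-device default is also an assumption.

```python
# Hyperparameters as listed in this card. The key names mirror
# transformers.TrainingArguments keyword arguments (an assumption:
# the card does not say which training API was used).
HYPERPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 2,
    "learning_rate": 5e-5,
    "gradient_accumulation_steps": 8,
    "weight_decay": 0.01,
}


def effective_batch_size(params, num_devices=1):
    """Number of samples contributing to each optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(HYPERPARAMS))  # 2 * 8 * 1 = 16
```

With gradient accumulation, a per-device batch of 2 behaves like a batch of 16 for each weight update while keeping peak memory close to that of a batch of 2, which fits the GPU memory constraint mentioned above.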

Results

Evaluation results:

  • eval_loss: nan
  • eval_runtime: 0.7866 (seconds)
  • eval_samples_per_second: 38.139
  • eval_steps_per_second: 19.069
  • epoch: 2.909

The evaluation loss is NaN, so no valid evaluation loss is available for this checkpoint.
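
The reported metrics can be sanity-checked with a few lines of Python. The sketch below flags a non-finite loss and recovers the approximate evaluation set size from the runtime and throughput figures. `summarize_eval` is a hypothetical helper, not part of any library, and the derived counts are estimates based on rounding.

```python
import math

# Evaluation metrics copied from this card.
eval_results = {
    "eval_loss": float("nan"),
    "eval_runtime": 0.7866,
    "eval_samples_per_second": 38.139,
    "eval_steps_per_second": 19.069,
    "epoch": 2.909090909090909,
}


def summarize_eval(results):
    """Flag a non-finite loss and estimate sample and step counts."""
    loss = results["eval_loss"]
    runtime = results["eval_runtime"]
    return {
        # NaN or infinity means the loss cannot be used to judge the model.
        "loss_is_valid": math.isfinite(loss),
        # runtime * throughput approximates the number of items processed.
        "approx_samples": round(runtime * results["eval_samples_per_second"]),
        "approx_steps": round(runtime * results["eval_steps_per_second"]),
    }


summary = summarize_eval(eval_results)
print(summary)
```

For the values above this reports an invalid loss, roughly 30 evaluation samples, and roughly 15 evaluation steps, which corresponds to about 2 samples per evaluation step.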

Model size: 248M params (Safetensors, tensor type F32)