A fine-tuned version of google/flan-t5-small, trained on the agentlans/c4-en-lowercased dataset to restore proper capitalization in lowercased English text.
```python
import torch
from transformers import pipeline

# Use the first GPU if one is available, otherwise fall back to CPU.
device = 0 if torch.cuda.is_available() else -1

model_name = "agentlans/flan-t5-small-capitalizer"
flan_t5_pipeline = pipeline("text2text-generation", model=model_name, device=device)

input_text = "buzzfeed's 360-degree look at the aftermath of california's valley fire has been viewed more than 6 million times. plenty of viewers have been asking how we made it."
output = flan_t5_pipeline(input_text, max_length=1024)
print(output[0]["generated_text"])
# Expected output: Buzzfeed's 360-degree look at the aftermath of California's Valley Fire has been viewed more than 6 million times. Plenty of viewers have been asking how we made it.
```
The model was trained on a subset of the English configuration of the C4 dataset. The subset contains 125,000 rows, split into 100,000 for training and 25,000 for validation. Each row includes the original text and its lowercased version. After processing 56,941,616 input tokens, the model reaches a final validation loss of 0.1338.
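To illustrate the row layout described above, here is a minimal sketch of how lowercased/original pairs could be built and split. The helper names `make_pair` and `split_rows` and the field names `input` and `output` are assumptions for illustration, not the dataset's actual column names or the author's preprocessing code.

```python
def make_pair(text: str) -> dict:
    """Build one row: the lowercased text is the model input,
    the original text is the target. Field names are hypothetical."""
    return {"input": text.lower(), "output": text}


def split_rows(rows: list, n_train: int) -> tuple:
    """Split rows into a training part and a validation part."""
    return rows[:n_train], rows[n_train:]


texts = [
    "Plenty of viewers asked how BuzzFeed made it.",
    "The Valley Fire burned in California.",
]
rows = [make_pair(t) for t in texts]

# The real dataset uses 100,000 training rows and 25,000 validation rows;
# this toy example uses one of each.
train, valid = split_rows(rows, n_train=1)
print(train[0]["input"])
print(train[0]["output"])
```

Because the target is simply the untouched original text, no manual labeling is needed: any cased English corpus can supply training pairs in this way.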
The model was trained using the following key hyperparameters:
Additional training arguments included bf16 precision, automatic batch size finding, and the use of a sortish sampler.
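The additional arguments mentioned above correspond to options in the `transformers` training API. The following is a hedged sketch of how they might be expressed as keyword arguments for `Seq2SeqTrainingArguments`. The `output_dir` value is a placeholder, `num_train_epochs=1` is inferred from the results table, and the other hyperparameters used for this model are not shown here.

```python
# Keyword arguments that could be passed to
# transformers.Seq2SeqTrainingArguments(**training_kwargs).
# Only the three "additional" options come from the model card;
# the remaining values are assumptions for illustration.
training_kwargs = {
    "output_dir": "flan-t5-small-capitalizer",  # placeholder path (assumption)
    "num_train_epochs": 1,         # inferred from the results table
    "bf16": True,                  # bfloat16 mixed precision
    "auto_find_batch_size": True,  # retry with a smaller batch on out-of-memory errors
    "sortish_sampler": True,       # group samples of similar length to reduce padding
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

Note that `bf16` requires hardware with bfloat16 support, and `auto_find_batch_size` relies on the `accelerate` package being installed.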
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.2532 | 0.05 | 2500 | 0.1739 | 2824810 |
| 0.231 | 0.1 | 5000 | 0.1653 | 5702148 |
| 0.2163 | 0.15 | 7500 | 0.1571 | 8531178 |
| 0.1966 | 0.2 | 10000 | 0.1529 | 11350902 |
| 0.2013 | 0.25 | 12500 | 0.1491 | 14191502 |
| 0.1971 | 0.3 | 15000 | 0.1464 | 17050704 |
| 0.1791 | 0.35 | 17500 | 0.1447 | 19857804 |
| 0.193 | 0.4 | 20000 | 0.1424 | 22687180 |
| 0.1821 | 0.45 | 22500 | 0.1416 | 25532518 |
| 0.19 | 0.5 | 25000 | 0.1397 | 28423408 |
| 0.1753 | 0.55 | 27500 | 0.1388 | 31248170 |
| 0.184 | 0.6 | 30000 | 0.1378 | 34048604 |
| 0.1717 | 0.65 | 32500 | 0.1371 | 36903282 |
| 0.1693 | 0.7 | 35000 | 0.1359 | 39709784 |
| 0.1729 | 0.75 | 37500 | 0.1345 | 42614112 |
| 0.1711 | 0.8 | 40000 | 0.1344 | 45471178 |
| 0.1735 | 0.85 | 42500 | 0.1340 | 48355942 |
| 0.1797 | 0.9 | 45000 | 0.1340 | 51187066 |
| 0.1659 | 0.95 | 47500 | 0.1338 | 54074434 |
| 0.1658 | 1.0 | 50000 | 0.1338 | 56941616 |
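As a quick reading aid for the table, the short calculation below compares the first and last logged checkpoints. It only restates figures from the table; the variable names are chosen here for illustration.

```python
# (step, validation loss, input tokens seen) for the first and last logged checkpoints
first = (2_500, 0.1739, 2_824_810)
last = (50_000, 0.1338, 56_941_616)

# Relative reduction in validation loss between the two checkpoints.
relative_drop = (first[1] - last[1]) / first[1]

# Average number of input tokens consumed per optimizer step over the full run.
tokens_per_step = last[2] / last[0]

print(f"Validation loss drop: {relative_drop:.1%}")
print(f"Average input tokens per step: {tokens_per_step:.0f}")
```

This works out to a drop of roughly 23% in validation loss and about 1,139 input tokens per step. The last two checkpoints share the same validation loss (0.1338), which suggests the loss had largely plateaued by the end of the epoch.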