Re-Punctuate:
Re-Punctuate is a T5 model that attempts to restore correct capitalization and punctuation in sentences.
Dataset:
The model was fine-tuned on the DialogSum dataset (115,056 records) for punctuation and capitalization correction.
Usage:
from transformers import T5Tokenizer, TFT5ForConditionalGeneration

# Load the tokenizer and the TensorFlow T5 model from the Hugging Face Hub
tokenizer = T5Tokenizer.from_pretrained('SJ-Ray/Re-Punctuate')
model = TFT5ForConditionalGeneration.from_pretrained('SJ-Ray/Re-Punctuate')

input_text = 'the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination'

# Prepend the "punctuate: " task prefix and encode as TensorFlow tensors
inputs = tokenizer.encode("punctuate: " + input_text, return_tensors="tf")

# Generate the corrected text and decode it back to a string
result = model.generate(inputs)
decoded_output = tokenizer.decode(result[0], skip_special_tokens=True)
print(decoded_output)
Example:
Input: the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination
Output: The story of this brave, brilliant athlete, whose very being was questioned so publicly, is one that still captures the imagination.
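For text longer than a single sentence, one possible approach is to split the input into smaller pieces and add the "punctuate: " prefix to each piece before encoding. The sketch below is only an illustration: the helper name build_prompts and the max_words chunk size are hypothetical and are not part of the model or the transformers library, and the card does not document a recommended input length.

```python
PREFIX = "punctuate: "  # task prefix taken from the usage snippet


def build_prompts(text, max_words=40):
    """Split text into word chunks and prepend the task prefix to each.

    max_words is an assumed chunk size chosen for illustration only;
    it is not a documented limit of the model.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    # Slice the word list into consecutive windows of at most max_words
    chunks = [words[i:i + max_words] for i in range(0, len(words), max_words)]
    return [PREFIX + " ".join(chunk) for chunk in chunks]


prompts = build_prompts("hello how are you i am fine thank you", max_words=4)
for prompt in prompts:
    print(prompt)
```

Each resulting prompt could then be passed to tokenizer.encode and model.generate in place of the single prefixed string in the usage snippet. Chunk boundaries may fall in the middle of a sentence, so the joined output would likely need a manual check.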
Connect on LinkedIn: Suraj Kumar