News2Topic-T5-base

Model Details

  • Model type: Text-to-Text Generation
  • Language(s) (NLP): English
  • License: MIT License
  • Finetuned from model: T5 Base Model (Google AI)

Uses

The News2Topic T5-base model generates topic names from news articles or news-like text. It can be integrated into news aggregation platforms and content management systems, or used to improve news browsing and search by attaching a concise topic to each article.
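As an illustration of the aggregation use case, the sketch below groups articles under the topic name generated for each one. The helper names `group_by_topic` and `load_news2topic` are hypothetical and not part of this model or of `transformers`. The only assumption about the model is that it is called through a `text2text-generation` pipeline, which returns a list of dicts with a `generated_text` key.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_topic(
    articles: Iterable[str],
    generate: Callable[[str], list],
) -> Dict[str, List[str]]:
    """Group article texts under the topic name generated for each one.

    `generate` is any callable with the output shape of a Hugging Face
    text2text-generation pipeline: a list of dicts, each with a
    "generated_text" key.
    """
    groups = defaultdict(list)
    for text in articles:
        # Take the first (and, for a single input, only) generation.
        topic = generate(text)[0]["generated_text"].strip()
        groups[topic].append(text)
    return dict(groups)


def load_news2topic():
    """Load the real model (hypothetical helper).

    Needs the `transformers` package and downloads the checkpoint on first use.
    """
    from transformers import pipeline

    return pipeline("text2text-generation", model="textgain/News2Topic-T5-base")
```

With the model loaded, `group_by_topic(articles, load_news2topic())` returns a dict mapping each generated topic to the articles that received it. Exact string matching on topic names is a simplification: two articles on the same subject can receive differently worded topics.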

How to Get Started with the Model

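from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub
pipe = pipeline("text2text-generation", model="textgain/News2Topic-T5-base")

news_text = "Your news text here."

# The pipeline returns a list of dicts; the topic is under "generated_text"
result = pipe(news_text)
print(result[0]["generated_text"])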

Training Details

The News2Topic T5-base model was trained on a 21K sample of the "Newsroom" dataset (https://lil.nlp.cornell.edu/newsroom/index.html), annotated with synthetic data generated by GPT-3.5-turbo.

The model was trained for 3 epochs, with a learning rate of 0.00001, a maximum sequence length of 512, and a training batch size of 12.
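The reported hyperparameters can be collected in a small config object, which also gives a rough idea of the training length. This is only a sketch, not the authors' training script: the `TrainConfig` class is hypothetical, and the step counts assume exactly 21,000 samples, a single device, and no gradient accumulation, none of which the description above states.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters reported for News2Topic-T5-base (hypothetical container)."""

    num_samples: int = 21_000      # assumption: "21K" read as exactly 21,000
    epochs: int = 3
    learning_rate: float = 1e-5    # 0.00001
    max_seq_length: int = 512
    batch_size: int = 12

    def steps_per_epoch(self) -> int:
        # Assumes one device and no gradient accumulation.
        return math.ceil(self.num_samples / self.batch_size)

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.epochs

    def to_training_arguments(self, output_dir: str):
        """Map the values onto Seq2SeqTrainingArguments (needs `transformers`)."""
        from transformers import Seq2SeqTrainingArguments

        # max_seq_length is not a trainer argument; it applies at tokenization.
        return Seq2SeqTrainingArguments(
            output_dir=output_dir,
            num_train_epochs=self.epochs,
            learning_rate=self.learning_rate,
            per_device_train_batch_size=self.batch_size,
        )


cfg = TrainConfig()
print(cfg.steps_per_epoch())  # 1750
print(cfg.total_steps())      # 5250
```

Under these assumptions the run amounts to about 5,250 optimizer steps. A different sample count or a multi-device setup would change that figure.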

Citation

BibTeX:

@article{Kosar_DePauw_Daelemans_2024,
  title={Comparative Evaluation of Topic Detection: Humans vs. LLMs},
  author={Kosar, Andriy and De Pauw, Guy and Daelemans, Walter},
  journal={Computational Linguistics in the Netherlands Journal},
  volume={13},
  year={2024},
  month={Mar.},
  pages={91--120},
  url={https://www.clinjournal.org/clinj/article/view/173}
}