license: mit
datasets:
  - abisee/cnn_dailymail
language:
  - en
metrics:
  - rouge
  - bleu
base_model:
  - google-t5/t5-small
pipeline_tag: summarization
library_name: transformers

Model Card for t5-small Summarization Model

Model Details

Model Architecture: T5 (Text-to-Text Transfer Transformer)
Variant: t5-small
Task: Text Summarization
Framework: Hugging Face Transformers

Training Data

Dataset: CNN/DailyMail
Content: News articles and their summaries
Size: Approximately 300,000 article-summary pairs

Training Procedure

Fine-tuning method: Fine-tuned with the Hugging Face Transformers library
Hyperparameters:
- Learning rate: 5e-5
- Batch size: 8
- Number of epochs: 3
Optimizer: AdamW
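
To give a sense of what these hyperparameters imply, the sketch below estimates the number of optimizer updates in a full run. It uses the card's rounded dataset size and assumes that all pairs are used for training, on a single device, with no gradient accumulation. None of these assumptions is confirmed by the card, and the function name is illustrative.

```python
import math

# Values taken from the card. The dataset size is the card's rounded figure,
# and treating all of it as training data is an assumption.
NUM_EXAMPLES = 300_000
BATCH_SIZE = 8
NUM_EPOCHS = 3


def optimizer_steps(num_examples: int, batch_size: int, num_epochs: int) -> int:
    """Total optimizer updates, assuming one device and no gradient accumulation."""
    # The final, possibly partial, batch of each epoch still counts as a step.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


total_steps = optimizer_steps(NUM_EXAMPLES, BATCH_SIZE, NUM_EPOCHS)
print(total_steps)  # 112500
```

Under these assumptions each epoch is 37,500 steps, which is a useful reference point when choosing logging, evaluation, or checkpoint intervals.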

How to Use

Install the Hugging Face Transformers library and PyTorch (the example below returns PyTorch tensors):

pip install transformers torch

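Load the model and tokenizer:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "t5-small" is the base checkpoint named in this card; to load fine-tuned
# weights instead, substitute the Hub ID of the repository that hosts them.
checkpoint = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)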

Generate a summary:

input_text = "Your input text here"
# T5 expects a task prefix; inputs longer than 512 tokens are truncated.
inputs = tokenizer("summarize: " + input_text, return_tensors="pt", max_length=512, truncation=True)
# Beam search with 4 beams; passing **inputs also supplies the attention mask.
summary_ids = model.generate(**inputs, max_length=150, min_length=40, length_penalty=2.0, num_beams=4, early_stopping=True)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)

Evaluation

Metric: ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
Exact scores are not available. Summarization models are typically evaluated on:
- ROUGE-1 (unigram overlap)
- ROUGE-2 (bigram overlap)
- ROUGE-L (longest common subsequence)

Limitations

Performance may be lower than that of larger T5 variants
Optimized for news article summarization; may not perform as well on other text types
Inputs longer than 512 tokens are truncated, so content beyond that point is not summarized
Generated summaries may contain factual inaccuracies
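
One common workaround for the 512-token input limit, which this card does not describe or evaluate, is to split a long document's token IDs into windows and summarize each window separately. The sketch below shows only the splitting step on a plain list of IDs. The function name, the `reserved` budget for the task prefix and end-of-sequence token, and the `overlap` value are all assumptions for illustration.

```python
def chunk_token_ids(token_ids, max_tokens=512, reserved=0, overlap=0):
    """Split a list of token IDs into windows that fit the input limit.

    reserved: tokens held back per window (e.g. task prefix, end-of-sequence).
    overlap:  tokens repeated between consecutive windows.
    """
    window = max_tokens - reserved
    if window <= 0:
        raise ValueError("reserved must be smaller than max_tokens")
    if not 0 <= overlap < window:
        raise ValueError("overlap must be at least 0 and smaller than the window")
    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the input
    return chunks


# Hypothetical 1,200-token document, represented here by dummy IDs.
ids = list(range(1200))
chunks = chunk_token_ids(ids, max_tokens=512, reserved=12, overlap=50)
print([len(c) for c in chunks])  # [500, 500, 300]
```

Each window would still need the task prefix added before generation, and the per-window summaries have to be combined afterwards. How well that works for this model is untested.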

Ethical Considerations

May inherit biases present in the CNN/DailyMail dataset
Not suitable for summarizing sensitive or critical information without human review
Users should be aware of potential biases and inaccuracies in generated summaries
Should not be used as a sole source of information for decision-making processes