---
language:
- en
pipeline_tag: summarization
---

# Model Overview

The News Articles Teacher-Student Abstractive Summarizer is a fine-tuned model based on BART-large, with StableBeluga-7B as the teacher model. It is designed to produce high-quality abstractive summaries of news articles while running faster and using less computational resources than the teacher.

# Model Details

- Model Type: Abstractive Summarization
- Base Model: BART-large
- Teacher Model: StableBeluga-7B
- Language: English

# Dataset

- Source: 295,174 news articles scraped from a Mexican newspaper.
- Translation: The Spanish articles were translated to English with the Helsinki-NLP/opus-mt-es-en model (a sketch of this step follows the list).
- Teacher Summaries: Generated by StableBeluga-7B.
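
For illustration, the translation step can be reproduced with the `transformers` pipeline; the example sentence and generation settings are assumptions, as the exact configuration used to build the dataset is not stated in this card:

```python
from transformers import pipeline

# Spanish-to-English translation model named in the dataset description.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-es-en")

# Hypothetical example sentence; the real articles would be processed in batches.
article_es = "El gobierno anunció nuevas medidas económicas esta semana."
print(translator(article_es, max_length=512)[0]["translation_text"])
```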

# Training

The fine-tuning process involved using the teacher observations (summaries) generated by StableBeluga-7B to train a lightweight BART model. This approach aims to replicate the summarization quality of the teacher model while achieving faster inference times and reduced GPU memory usage.
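
A minimal sketch of this sequence-level distillation setup, assuming a `datasets.Dataset` with hypothetical `article` and `teacher_summary` columns; the hyperparameters below are illustrative, not the ones actually used:

```python
from transformers import (
    BartForConditionalGeneration,
    BartTokenizerFast,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

def preprocess(batch):
    # Inputs are the translated articles; targets are the teacher's summaries,
    # so the student learns to imitate StableBeluga-7B's outputs.
    model_inputs = tokenizer(batch["article"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["teacher_summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

args = Seq2SeqTrainingArguments(
    output_dir="bart-news-distilled",  # hypothetical output path
    per_device_train_batch_size=8,     # illustrative hyperparameters
    num_train_epochs=3,
    learning_rate=3e-5,
)

# raw_dataset is assumed to hold the translated articles and teacher summaries:
# train_dataset = raw_dataset.map(preprocess, batched=True)
# trainer = Seq2SeqTrainer(
#     model=model,
#     args=args,
#     train_dataset=train_dataset,
#     data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
# )
# trainer.train()
```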

# Performance

- Evaluation Metrics (see the evaluation sketch below):
  - ROUGE-1: 0.66
  - Cosine Similarity: 0.90
- Inference Speed: 3x faster than the teacher model (StableBeluga-7B)
- Resource Usage: Requires significantly less GPU memory than StableBeluga-7B
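
Metrics of this kind can be computed as follows; the embedding model used for cosine similarity is an assumption, since the card does not specify one:

```python
import evaluate
from sentence_transformers import SentenceTransformer, util

# Toy prediction/reference pair; in practice, use the full evaluation split.
predictions = ["The council approved extra funds for public transit."]
references = ["City council votes to increase public transportation funding."]

# ROUGE-1 overlap between student summaries and reference summaries.
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references)["rouge1"])

# Cosine similarity of sentence embeddings (embedding model is an assumption).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
pred_emb = encoder.encode(predictions, convert_to_tensor=True)
ref_emb = encoder.encode(references, convert_to_tensor=True)
print(util.cos_sim(pred_emb, ref_emb).diagonal().mean().item())
```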

# Objective

The primary goal of this model is to provide a lightweight summarization solution that maintains high-quality output similar to the teacher model (StableBeluga-7B) but operates with greater efficiency, making it suitable for deployment in resource-constrained environments.

# Use Cases

This model is ideal for applications requiring quick and efficient summarization of large volumes of news articles, particularly in settings where computational resources are limited.

# Limitations

- Language Translation: The initial translation from Spanish to English may introduce minor inaccuracies that could affect the summarization quality.
- Domain Specificity: Because the model was fine-tuned specifically on news articles, its performance may vary on texts from other domains.

# Future Work

Future improvements could involve:

- Fine-tuning the model on bilingual data to eliminate the translation step.
- Expanding the dataset to include a wider variety of news sources and topics.
- Exploring further optimizations to reduce inference time and resource usage.

# Conclusion

The News Articles Teacher-Student Abstractive Summarizer model demonstrates the potential to deliver high-quality summaries efficiently, making it a valuable tool for news content processing and similar applications.

# How to Use

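A minimal usage sketch with the `transformers` summarization pipeline; the Hub repository id below is a placeholder, since it is not stated here, and the generation settings are illustrative:

```python
from transformers import pipeline

# Placeholder repo id -- substitute this model's actual Hugging Face Hub id.
model_id = "your-namespace/news-teacher-student-summarizer"

summarizer = pipeline("summarization", model=model_id)

article = (
    "The city council approved a new budget on Tuesday, allocating additional "
    "funds to public transportation and road maintenance over the coming year."
)
print(summarizer(article, max_length=128, min_length=30)[0]["summary_text"])
```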