---
language:
- en
pipeline_tag: summarization
---

# Model Overview

The News Articles Teacher-Student Abstractive Summarizer is a fine-tuned model based on BART-large, with StableBeluga-7B as the teacher model. It is designed to produce high-quality abstractive summaries of news articles while running faster and using less computational resources than the teacher.

# Model Details

- Model Type: Abstractive Summarization
- Base Model: BART-large
- Teacher Model: StableBeluga-7B
- Language: English

# Dataset

- Source: 295,174 news articles scraped from a Mexican newspaper.
- Translation: The Spanish articles were translated to English with the Helsinki-NLP/opus-mt-es-en model (a sketch of this step follows the list).
- Teacher Summaries: Generated by StableBeluga-7B.
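
For illustration, the translation step can be reproduced with the `transformers` pipeline; the example sentence and generation settings are assumptions, as the exact configuration used to build the dataset is not stated in this card:

```python
from transformers import pipeline

# Spanish-to-English translation model named in the dataset description.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-es-en")

# Hypothetical example sentence; the real articles would be processed in batches.
article_es = "El gobierno anunció nuevas medidas económicas esta semana."
print(translator(article_es, max_length=512)[0]["translation_text"])
```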

# Training

The fine-tuning process involved using the teacher observations (summaries) generated by StableBeluga-7B to train a lightweight BART model. This approach aims to replicate the summarization quality of the teacher model while achieving faster inference times and reduced GPU memory usage.
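
A minimal sketch of this sequence-level distillation setup, assuming a `datasets.Dataset` with hypothetical `article` and `teacher_summary` columns; the hyperparameters below are illustrative, not the ones actually used:

```python
from transformers import (
    BartForConditionalGeneration,
    BartTokenizerFast,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

def preprocess(batch):
    # Inputs are the translated articles; targets are the teacher's summaries,
    # so the student learns to imitate StableBeluga-7B's outputs.
    model_inputs = tokenizer(batch["article"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["teacher_summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

args = Seq2SeqTrainingArguments(
    output_dir="bart-news-distilled",  # hypothetical output path
    per_device_train_batch_size=8,     # illustrative hyperparameters
    num_train_epochs=3,
    learning_rate=3e-5,
)

# raw_dataset is assumed to hold the translated articles and teacher summaries:
# train_dataset = raw_dataset.map(preprocess, batched=True)
# trainer = Seq2SeqTrainer(
#     model=model,
#     args=args,
#     train_dataset=train_dataset,
#     data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
# )
# trainer.train()
```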

# Performance

- Evaluation Metrics (see the evaluation sketch below):
  - ROUGE-1: 0.66
  - Cosine Similarity: 0.90
- Inference Speed: 3x faster than the teacher model (StableBeluga-7B)
- Resource Usage: Requires significantly less GPU memory than StableBeluga-7B
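
Metrics of this kind can be computed as follows; the embedding model used for cosine similarity is an assumption, since the card does not specify one:

```python
import evaluate
from sentence_transformers import SentenceTransformer, util

# Toy prediction/reference pair; in practice, use the full evaluation split.
predictions = ["The council approved extra funds for public transit."]
references = ["City council votes to increase public transportation funding."]

# ROUGE-1 overlap between student summaries and reference summaries.
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references)["rouge1"])

# Cosine similarity of sentence embeddings (embedding model is an assumption).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
pred_emb = encoder.encode(predictions, convert_to_tensor=True)
ref_emb = encoder.encode(references, convert_to_tensor=True)
print(util.cos_sim(pred_emb, ref_emb).diagonal().mean().item())
```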

# Objective

The primary goal of this model is to provide a lightweight summarization solution that maintains high-quality output similar to the teacher model (StableBeluga-7B) but operates with greater efficiency, making it suitable for deployment in resource-constrained environments.

# Use Cases

This model is ideal for applications requiring quick and efficient summarization of large volumes of news articles, particularly in settings where computational resources are limited.

# Limitations

- Language Translation: The initial translation from Spanish to English may introduce minor inaccuracies that could affect the summarization quality.
- Domain Specificity: Because the model was fine-tuned specifically on news articles, its performance may vary on texts from other domains.

# Future Work

Future improvements could involve:

- Fine-tuning the model on bilingual data to eliminate the translation step.
- Expanding the dataset to include a wider variety of news sources and topics.
- Exploring further optimizations to reduce inference time and resource usage.

# Conclusion

The News Articles Teacher-Student Abstractive Summarizer model demonstrates the potential to deliver high-quality summaries efficiently, making it a valuable tool for news content processing and similar applications.

# How to Use

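A minimal usage sketch with the `transformers` summarization pipeline; the Hub repository id below is a placeholder, since it is not stated here, and the generation settings are illustrative:

```python
from transformers import pipeline

# Placeholder repo id -- substitute this model's actual Hugging Face Hub id.
model_id = "your-namespace/news-teacher-student-summarizer"

summarizer = pipeline("summarization", model=model_id)

article = (
    "The city council approved a new budget on Tuesday, allocating additional "
    "funds to public transportation and road maintenance over the coming year."
)
print(summarizer(article, max_length=128, min_length=30)[0]["summary_text"])
```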