datasets:
- mbshr/XSUMUrdu-DW_BBC
language:
- ur
metrics:
- rouge
- bertscore
pipeline_tag: summarization
Model Card for urT5
Summarization Model (Type: T5)
Summarization: Extractive and Abstractive
- urT5 is adapted from mT5 and keeps only a monolingual vocabulary of 40k Urdu tokens.
- Fine-tuned on https://huggingface.co/mbshr/XSUMUrdu-DW_BBC; see https://doi.org/10.48550/arXiv.2310.02790 for details.
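To make the idea of a monolingual vocabulary concrete, the sketch below shows one generic way to shrink a multilingual embedding table to a chosen subset of token IDs. It is an assumption for illustration only, not the procedure used to build urT5 (the paper linked above describes that). The function name `trim_embeddings` and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def trim_embeddings(
    embeddings: Sequence[Sequence[float]],
    keep_ids: Sequence[int],
) -> Tuple[List[List[float]], Dict[int, int]]:
    """Keep only the embedding rows for `keep_ids` (hypothetical helper).

    Returns the smaller embedding table and a map from old token IDs
    to their new, contiguous IDs.
    """
    # Deduplicate and sort so that new IDs are stable and contiguous.
    kept = sorted(set(keep_ids))
    old_to_new = {old: new for new, old in enumerate(kept)}
    trimmed = [list(embeddings[old]) for old in kept]
    return trimmed, old_to_new


# Toy example: a 4-token vocabulary reduced to tokens 1 and 3.
toy_table = [[0.0, 0.0], [1.0, 1.5], [2.0, 2.5], [3.0, 3.5]]
small_table, id_map = trim_embeddings(toy_table, [3, 1])
```

In a real model the same remapping would also have to be applied to the tokenizer and to the output projection, which this sketch does not cover.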
Model Details
Model Description
- Developed by: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: urT5, an adapted version of mT5
- Language(s) (NLP): Urdu
- License: [More Information Needed]
- Finetuned from model [optional]: google/mt5-base
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: https://doi.org/10.48550/arXiv.2310.02790
Uses
Summarization
How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
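Until official usage instructions are provided, the following is a minimal sketch of how a T5-style summarization checkpoint is typically loaded with the Hugging Face `transformers` library. Several things here are assumptions rather than details from this card: `MODEL_ID` is a placeholder that must be replaced with this model's actual Hub repository ID, the decoding settings in `generation_kwargs` are illustrative and are not the values used in the paper, and no task prefix is prepended to the input.

```python
from typing import Any, Dict

# Placeholder (assumption): replace with the actual Hub repository ID of this model.
MODEL_ID = "<namespace>/<model-name>"


def generation_kwargs(max_new_tokens: int = 64, num_beams: int = 4) -> Dict[str, Any]:
    """Illustrative decoding settings; not taken from the paper."""
    return {
        "max_new_tokens": max_new_tokens,
        "num_beams": num_beams,
        "no_repeat_ngram_size": 3,
    }


def summarize(text: str, model_id: str = MODEL_ID, max_input_tokens: int = 512) -> str:
    """Generate a summary for one Urdu article."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long articles to the assumed maximum input length.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, **generation_kwargs())
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` with an Urdu article string downloads the checkpoint on first use and returns the decoded summary. For repeated calls it would be more efficient to load the tokenizer and model once and reuse them.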
Training Details
Evaluation & Results
Evaluated on https://huggingface.co/mbshr/XSUMUrdu-DW_BBC
- ROUGE-1 F-score: 40.03 combined (46.35 on BBC Urdu datapoints only, 36.91 on DW Urdu datapoints only)
- BERTScore: 75.1 combined (77.0 on BBC Urdu datapoints only, 74.16 on DW Urdu datapoints only)
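For readers unfamiliar with the metric, the sketch below shows how a ROUGE-1 F-score is computed for a single reference/candidate pair: unigram overlap gives precision and recall, and the F-score is their harmonic mean. It assumes plain whitespace tokenization and returns a value in the range 0 to 1; it is not the evaluation script behind the numbers above, which may use a different tokenizer and scaling.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """ROUGE-1 F-score for one pair, using whitespace tokens (an assumption)."""
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())

    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Two of the three candidate tokens appear in the four-token reference.
score = rouge1_f("a b c d", "a b x")
```

A corpus-level figure is usually obtained by averaging the per-pair scores over the evaluation set.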
Citation [optional]
@misc{munaf2023low,
  title={Low Resource Summarization using Pre-trained Language Models},
  author={Mubashir Munaf and Hammad Afzal and Naima Iltaf and Khawir Mahmood},
  year={2023},
  eprint={2310.02790},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}