---
datasets:
- mbshr/XSUMUrdu-DW_BBC
language:
- ur
metrics:
- rouge
- bertscore
pipeline_tag: summarization
---

# Model Card for Model ID

### Summarization Model (Type: T5)

Summarization: Extractive and Abstractive

- urT5, adapted from mT5 with a monolingual vocabulary only: 40k Urdu tokens.
- Fine-tuned on https://huggingface.co/mbshr/XSUMUrdu-DW_BBC; refer to https://doi.org/10.48550/arXiv.2310.02790 for details.

## Model Details

### Model Description

- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** urT5, an adapted version of mT5
- **Language(s) (NLP):** Urdu
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** google/mt5-base

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** https://doi.org/10.48550/arXiv.2310.02790

## Uses

Summarization

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

## Evaluation & Results

Evaluated on https://huggingface.co/mbshr/XSUMUrdu-DW_BBC

- ROUGE-1 F Score: 40.03 combined, 46.35 on BBC Urdu datapoints only, and 36.91 on DW Urdu datapoints only
- BERTScore: 75.1 combined, 77.0 on BBC Urdu datapoints only, and 74.16 on DW Urdu datapoints only

## Citation [optional]

    @misc{munaf2023low,
          title={Low Resource Summarization using Pre-trained Language Models},
          author={Mubashir Munaf and Hammad Afzal and Naima Iltaf and Khawir Mahmood},
          year={2023},
          eprint={2310.02790},
          archivePrefix={arXiv},
          primaryClass={cs.CL}
    }

## Contact

- mubashir.munaaf@gmail.com
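
## Appendix: ROUGE-1 F Score Sketch

The ROUGE-1 F score listed under Evaluation & Results is, in general, the harmonic mean of unigram precision and unigram recall between a generated summary and its reference. The sketch below illustrates that arithmetic only. It assumes plain whitespace tokenization and no stemming, which may differ from the scoring setup behind the reported numbers. The function name `rouge1_f` and the toy strings are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Assumption: tokens are separated by whitespace; no stemming or
    normalization is applied.
    """
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    if not ref_tokens or not cand_tokens:
        return 0.0

    # Clipped overlap: a unigram counts at most as often as it occurs
    # in both the reference and the candidate.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)  # share of candidate tokens matched
    recall = overlap / len(ref_tokens)      # share of reference tokens matched
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy example: 2 shared unigrams, precision 2/3, recall 2/4.
    score = rouge1_f("a b c d", "a b e")
    print(round(100 * score, 2))  # 57.14
```

Scores in this card are on a 0 to 100 scale, so the fraction returned here is multiplied by 100 for comparison. For real evaluation, an established ROUGE implementation with a tokenizer suited to Urdu is likely a better choice than this sketch.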