---
language: en
license: apache-2.0
tags:
- summarization
datasets:
- scientific_papers
model-index:
- name: google/bigbird-pegasus-large-pubmed
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: scientific_papers
      type: scientific_papers
      config: pubmed
      split: test
    metrics:
    - type: rouge
      value: 40.8966
      name: ROUGE-1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjhmMTg1M2FmMGNhMjJjMzJmMDgzZTZkN2Q3ZDcyZmJhZjZiMWRhZDYxYWU0OTM4MDc5M2RlYjk4OTY4MTk2NCIsInZlcnNpb24iOjF9.SoR8ISzeiIRmDW8UWhtxSX1a7A2DZWbjGMlPdUEXasBvXQsOTOAEfEk7XI-6Ah5aCnXyYT9FnzY8xQl9c_66Cw
    - type: rouge
      value: 18.1161
      name: ROUGE-2
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTdlMjU1MGE2YTQxZmNjMzU0YmNjNTM5OThhMjFiNGJhOThkNWY0YTQxNDFmZTg5MzliNmUzNmI3NDEwMWE3YSIsInZlcnNpb24iOjF9.BA8OVHy_Pk0lMZON9C42Uu6gd9N_b4etNSduuguAE_dd0PjX0Lw5S_0N7lPD722ro5AjBXHSHcj10BwxsRUsAA
    - type: rouge
      value: 26.1743
      name: ROUGE-L
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMGFjYWM1NzBjOTM3Nzg4ODU2MjFlNjAyNDQ5NjQ5YzQ1YmIzMDNlZDE2ZjE1MjZhYTkyMzI5Mzc4MDQ5NDk2MyIsInZlcnNpb24iOjF9.LBFqVbt8MHdJVQ_LiNb6wqVCBRKVnE4OVVUWwsVg6HX0-jnMga1ASEnURtVUvQhk84-gkiPeZZSE4SjKNFulDQ
    - type: rouge
      value: 34.2773
      name: ROUGE-LSUM
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2Q4ODA3YmFjNGU2ZjBmMWFkYTE1NmUwNzk4Mjk4NmUxZThlM2QyNWQwMzNhN2VkZTU1MTI4ZTY4ZGI0NTQxOCIsInZlcnNpb24iOjF9.D37tnGTOvAKEl5CujVGLGICQPRv9yM5DU3PQJdQyxOIiyNe367bqjmVr00VvLmpQ0VNOZGM9VaycR_dmh_DDDQ
    - type: loss
      value: 2.1707184314727783
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGZlNjRkYzI1YzFlNDJmZmI2ZTI4OGJkZTZjY2QyYWQyZTA4NzEzNzY2ODIwNDVkNTZlNGEyYjZiNTk3NjQ4MiIsInZlcnNpb24iOjF9.8ioVz9nOz4OybNKDCRTKZqGXeLgT5TTz9Bj8yWLKNrFhOI_nTg0O-ZpZDyq7uQUkv0fOQz8ZKAGqHWQfwNeNAw
    - type: meteor
      value: 0.3513
      name: meteor
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTA0NTVjZjU1OGU1ODFlODIxZDU0YmYxODIwN2ZmNGM3YjkyMTFiNjMyYzA4MTc1ZjA0YzczYzgwMzE2N2JiOSIsInZlcnNpb24iOjF9.DfmgfbhlCusjv5hh9ND0VEFjbJz7to8_qXH5meU37SIZP-2ApgqShNjAjcRw2nRlgTH9fsrcALwg6zb-41XDDA
    - type: gen_len
      value: 221.2531
      name: gen_len
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNWFjNDM3YTAzNGU5NjU0Njg4MWUyZGM0OWJjMTNiMjJjMDRhOTcyMDM0NzdkODNhMDJhZTc0OTJkOTI3YjFmMyIsInZlcnNpb24iOjF9.NieaGIGTbAVP881vaD8zUHzmudvKDaf6Xv3O85TmjsE_rUnBqzF1uRBjfxsNSPZOaAZbRcqffL2Hh-RCcsXrBw
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: scientific_papers
      type: scientific_papers
      config: arxiv
      split: test
    metrics:
    - type: rouge
      value: 40.3815
      name: ROUGE-1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjc2YmNiZTQ3YWUxYmE3NzM0Yzc2YWMxZTlhMzc3MjMyMmQ0MWJiYWUyZDA1MWExYjQ0ODY4YzM4MzgyNWZiYyIsInZlcnNpb24iOjF9.QoJdI0BEjb08nJe1mSMFxzqHfni7cOCuDdNS82Xg0G4R9uSKDboQhiLXslFup74c0a2O7bTwWasQHu-mtng4Ag
    - type: rouge
      value: 14.374
      name: ROUGE-2
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiY2QyMDQ3Y2RjYTY1NjhhYzcxN2IwMGQ3YzU3Zjk4MDMyNTM4MmE5YjE4MmUzNzM3YjVhYTA2YjBiNjI0YzEyNCIsInZlcnNpb24iOjF9.84fv-gyLKj-cljtydFclw9_F18MLiLlbhrBxFCDFYdX31R7zLLfd382JllPfZI9no7cIB9ga-eUvtIQjJXSJCw
    - type: rouge
      value: 23.4773
      name: ROUGE-L
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjdiZWJhZjNlODY0YzMzOTc4MmZmMmUxZmQyMjgwZDMxY2Q2MmM3Y2M5MmNhZjRmNjg2M2YwNWFlOTY4MzZlMSIsInZlcnNpb24iOjF9.6WEJPyxVyirjAD3NK3z2FLguYH7iGXsQGd5R8j_5paBAihrmndm02pTODhNMN-ANjJSxylvuzElUVBTTDm0sAw
    - type: rouge
      value: 33.772
      name: ROUGE-LSUM
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjRkZDkxYTlkYzFhMDYzNWMyZjEwMDU1YzY5YzNiYjFlYjY0ZDZmODVmNWEzYjVhZTMzNDI5ZTM0M2VlNTllMSIsInZlcnNpb24iOjF9.u90bbTq2shxIrcDd2MxoEHbHs9ZBIenLiEhTYWIFnFiXHafXmLdnsxmjWnFXsT2tO_gCFPwYhx2Qla-9BpK8AQ
    - type: loss
      value: 3.235051393508911
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZWIyYTBhMTBiZWY1OGIzODk5Y2NjYzgyOWY0MjUwZmFkM2ZlZDhiNGY4ZTI3NWUxYzZhOTg1M2M2NzI3MTBkYyIsInZlcnNpb24iOjF9.HjwTRnmITF5d8zNH7WU-riPfpYgxKUBtxT6r3t2dp92ReMVl3CPk6GdWfZsrRcPmV3F7eZ7jqPuCy2wa-N1sDA
    - type: gen_len
      value: 186.2003
      name: gen_len
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNmE5N2FjNjAzNTgyNzEzZTUxYzk2NWRmZDk5YWU5ZTAxMjEwMjZjYThmYjM1OWE0ZDc3MmZlOWEyMDk4YWQ5ZSIsInZlcnNpb24iOjF9.cBmsTsmCN8MaMOp20q95u23oi1YV1G8MWzvUGwYK7I3JblTPmvL0uw8K5_6RMuZJjm6GWSpKp-CwK3styoyTAQ
---

# BigBirdPegasus model (large)

BigBird is a sparse-attention-based transformer that extends Transformer-based models, such as BERT, to much longer sequences. Moreover, BigBird comes with a theoretical understanding of which capabilities of a full transformer the sparse model can handle.

BigBird was introduced in this [paper](https://arxiv.org/abs/2007.14062) and first released in this [repository](https://github.com/google-research/bigbird).

Disclaimer: The team releasing BigBird did not write a model card for this model, so this model card has been written by the Hugging Face team.

## Model description

BigBird relies on **block sparse attention** instead of normal attention (i.e. BERT's attention) and can handle sequences up to a length of 4096 at a much lower compute cost than BERT. It has achieved SOTA on various tasks involving very long sequences, such as long document summarization and question answering with long contexts.
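
To give a rough sense of why block sparse attention is cheaper, the sketch below counts which key blocks each query block attends to for a 4096-token input. It is an illustration only, not the `transformers` implementation: the helper names `block_sparse_pattern` and `attended_fraction` are hypothetical, and the three-block sliding window and the choice of the first and last blocks as global blocks are assumptions made for this example. `block_size=64` and `num_random_blocks=3` are the defaults mentioned in this card.

```python
import random


def block_sparse_pattern(seq_len, block_size=64, num_random_blocks=3, seed=0):
    """Return, for each query block, the set of key blocks it attends to.

    Illustrative sketch only (hypothetical helper, not the library code).
    """
    if seq_len % block_size != 0:
        raise ValueError("seq_len must be a multiple of block_size")
    num_blocks = seq_len // block_size
    rng = random.Random(seed)
    # Assumption: the first and last blocks act as global blocks.
    global_blocks = {0, num_blocks - 1}

    pattern = []
    for q in range(num_blocks):
        if q in global_blocks:
            # Global blocks attend to every block.
            pattern.append(set(range(num_blocks)))
            continue
        # Every block attends to the global blocks ...
        keys = set(global_blocks)
        # ... to a sliding window of itself and its two neighbours (assumption) ...
        keys.update(k for k in (q - 1, q, q + 1) if 0 <= k < num_blocks)
        # ... and to a few randomly chosen blocks not already covered.
        candidates = [k for k in range(num_blocks) if k not in keys]
        keys.update(rng.sample(candidates, min(num_random_blocks, len(candidates))))
        pattern.append(keys)
    return pattern


def attended_fraction(pattern):
    """Fraction of (query block, key block) pairs that are actually computed."""
    num_blocks = len(pattern)
    return sum(len(keys) for keys in pattern) / (num_blocks * num_blocks)


pattern = block_sparse_pattern(seq_len=4096)
print(f"{len(pattern)} blocks, {attended_fraction(pattern):.1%} of block pairs attended")
```

Under these assumptions, each non-global query block touches a small, fixed number of key blocks regardless of sequence length, so the cost grows roughly linearly with the number of blocks instead of quadratically as in full attention.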

## How to use

Here is how to use this model to generate a summary of a given text in PyTorch:

```python
from transformers import BigBirdPegasusForConditionalGeneration, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/bigbird-pegasus-large-pubmed")

# by default encoder-attention is `block_sparse` with num_random_blocks=3, block_size=64
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-pubmed")

# decoder attention type can't be changed & will be "original_full"
# you can change `attention_type` (encoder only) to full attention like this:
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-pubmed", attention_type="original_full")

# you can change `block_size` & `num_random_blocks` like this:
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-pubmed", block_size=16, num_random_blocks=2)

text = "Replace me by any text you'd like."
inputs = tokenizer(text, return_tensors='pt')

# generate the summary token ids, then decode them back to text
prediction = model.generate(**inputs)
prediction = tokenizer.batch_decode(prediction)
```

## Training Procedure

This checkpoint was obtained by fine-tuning `BigBirdPegasusForConditionalGeneration` for **summarization** on the **pubmed** dataset from [scientific_papers](https://huggingface.co/datasets/scientific_papers).

## BibTeX entry and citation info

```tex
@misc{zaheer2021big,
      title={Big Bird: Transformers for Longer Sequences},
      author={Manzil Zaheer and Guru Guruganesh and Avinava Dubey and Joshua Ainslie and Chris Alberti and Santiago Ontanon and Philip Pham and Anirudh Ravula and Qifan Wang and Li Yang and Amr Ahmed},
      year={2021},
      eprint={2007.14062},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```