metadata

language: en
license: apache-2.0
tags:
  - summarization
datasets:
  - scientific_papers
model-index:
  - name: google/bigbird-pegasus-large-pubmed
    results:
      - task:
          type: summarization
          name: Summarization
        dataset:
          name: scientific_papers
          type: scientific_papers
          config: pubmed
          split: test
        metrics:
          - type: rouge
            value: 40.8966
            name: ROUGE-1
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjhmMTg1M2FmMGNhMjJjMzJmMDgzZTZkN2Q3ZDcyZmJhZjZiMWRhZDYxYWU0OTM4MDc5M2RlYjk4OTY4MTk2NCIsInZlcnNpb24iOjF9.SoR8ISzeiIRmDW8UWhtxSX1a7A2DZWbjGMlPdUEXasBvXQsOTOAEfEk7XI-6Ah5aCnXyYT9FnzY8xQl9c_66Cw
          - type: rouge
            value: 18.1161
            name: ROUGE-2
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTdlMjU1MGE2YTQxZmNjMzU0YmNjNTM5OThhMjFiNGJhOThkNWY0YTQxNDFmZTg5MzliNmUzNmI3NDEwMWE3YSIsInZlcnNpb24iOjF9.BA8OVHy_Pk0lMZON9C42Uu6gd9N_b4etNSduuguAE_dd0PjX0Lw5S_0N7lPD722ro5AjBXHSHcj10BwxsRUsAA
          - type: rouge
            value: 26.1743
            name: ROUGE-L
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMGFjYWM1NzBjOTM3Nzg4ODU2MjFlNjAyNDQ5NjQ5YzQ1YmIzMDNlZDE2ZjE1MjZhYTkyMzI5Mzc4MDQ5NDk2MyIsInZlcnNpb24iOjF9.LBFqVbt8MHdJVQ_LiNb6wqVCBRKVnE4OVVUWwsVg6HX0-jnMga1ASEnURtVUvQhk84-gkiPeZZSE4SjKNFulDQ
          - type: rouge
            value: 34.2773
            name: ROUGE-LSUM
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2Q4ODA3YmFjNGU2ZjBmMWFkYTE1NmUwNzk4Mjk4NmUxZThlM2QyNWQwMzNhN2VkZTU1MTI4ZTY4ZGI0NTQxOCIsInZlcnNpb24iOjF9.D37tnGTOvAKEl5CujVGLGICQPRv9yM5DU3PQJdQyxOIiyNe367bqjmVr00VvLmpQ0VNOZGM9VaycR_dmh_DDDQ
          - type: loss
            value: 2.1707184314727783
            name: loss
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNGZlNjRkYzI1YzFlNDJmZmI2ZTI4OGJkZTZjY2QyYWQyZTA4NzEzNzY2ODIwNDVkNTZlNGEyYjZiNTk3NjQ4MiIsInZlcnNpb24iOjF9.8ioVz9nOz4OybNKDCRTKZqGXeLgT5TTz9Bj8yWLKNrFhOI_nTg0O-ZpZDyq7uQUkv0fOQz8ZKAGqHWQfwNeNAw
          - type: meteor
            value: 0.3513
            name: meteor
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTA0NTVjZjU1OGU1ODFlODIxZDU0YmYxODIwN2ZmNGM3YjkyMTFiNjMyYzA4MTc1ZjA0YzczYzgwMzE2N2JiOSIsInZlcnNpb24iOjF9.DfmgfbhlCusjv5hh9ND0VEFjbJz7to8_qXH5meU37SIZP-2ApgqShNjAjcRw2nRlgTH9fsrcALwg6zb-41XDDA
          - type: gen_len
            value: 221.2531
            name: gen_len
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNWFjNDM3YTAzNGU5NjU0Njg4MWUyZGM0OWJjMTNiMjJjMDRhOTcyMDM0NzdkODNhMDJhZTc0OTJkOTI3YjFmMyIsInZlcnNpb24iOjF9.NieaGIGTbAVP881vaD8zUHzmudvKDaf6Xv3O85TmjsE_rUnBqzF1uRBjfxsNSPZOaAZbRcqffL2Hh-RCcsXrBw
      - task:
          type: summarization
          name: Summarization
        dataset:
          name: scientific_papers
          type: scientific_papers
          config: arxiv
          split: test
        metrics:
          - type: rouge
            value: 40.3815
            name: ROUGE-1
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjc2YmNiZTQ3YWUxYmE3NzM0Yzc2YWMxZTlhMzc3MjMyMmQ0MWJiYWUyZDA1MWExYjQ0ODY4YzM4MzgyNWZiYyIsInZlcnNpb24iOjF9.QoJdI0BEjb08nJe1mSMFxzqHfni7cOCuDdNS82Xg0G4R9uSKDboQhiLXslFup74c0a2O7bTwWasQHu-mtng4Ag
          - type: rouge
            value: 14.374
            name: ROUGE-2
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiY2QyMDQ3Y2RjYTY1NjhhYzcxN2IwMGQ3YzU3Zjk4MDMyNTM4MmE5YjE4MmUzNzM3YjVhYTA2YjBiNjI0YzEyNCIsInZlcnNpb24iOjF9.84fv-gyLKj-cljtydFclw9_F18MLiLlbhrBxFCDFYdX31R7zLLfd382JllPfZI9no7cIB9ga-eUvtIQjJXSJCw
          - type: rouge
            value: 23.4773
            name: ROUGE-L
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjdiZWJhZjNlODY0YzMzOTc4MmZmMmUxZmQyMjgwZDMxY2Q2MmM3Y2M5MmNhZjRmNjg2M2YwNWFlOTY4MzZlMSIsInZlcnNpb24iOjF9.6WEJPyxVyirjAD3NK3z2FLguYH7iGXsQGd5R8j_5paBAihrmndm02pTODhNMN-ANjJSxylvuzElUVBTTDm0sAw
          - type: rouge
            value: 33.772
            name: ROUGE-LSUM
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjRkZDkxYTlkYzFhMDYzNWMyZjEwMDU1YzY5YzNiYjFlYjY0ZDZmODVmNWEzYjVhZTMzNDI5ZTM0M2VlNTllMSIsInZlcnNpb24iOjF9.u90bbTq2shxIrcDd2MxoEHbHs9ZBIenLiEhTYWIFnFiXHafXmLdnsxmjWnFXsT2tO_gCFPwYhx2Qla-9BpK8AQ
          - type: loss
            value: 3.235051393508911
            name: loss
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZWIyYTBhMTBiZWY1OGIzODk5Y2NjYzgyOWY0MjUwZmFkM2ZlZDhiNGY4ZTI3NWUxYzZhOTg1M2M2NzI3MTBkYyIsInZlcnNpb24iOjF9.HjwTRnmITF5d8zNH7WU-riPfpYgxKUBtxT6r3t2dp92ReMVl3CPk6GdWfZsrRcPmV3F7eZ7jqPuCy2wa-N1sDA
          - type: gen_len
            value: 186.2003
            name: gen_len
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNmE5N2FjNjAzNTgyNzEzZTUxYzk2NWRmZDk5YWU5ZTAxMjEwMjZjYThmYjM1OWE0ZDc3MmZlOWEyMDk4YWQ5ZSIsInZlcnNpb24iOjF9.cBmsTsmCN8MaMOp20q95u23oi1YV1G8MWzvUGwYK7I3JblTPmvL0uw8K5_6RMuZJjm6GWSpKp-CwK3styoyTAQ

BigBirdPegasus model (large)

BigBird is a sparse-attention-based transformer that extends Transformer-based models, such as BERT, to much longer sequences. Moreover, BigBird comes with a theoretical understanding of which capabilities of a full transformer the sparse model can handle.

BigBird was introduced in the paper "Big Bird: Transformers for Longer Sequences" (arXiv:2007.14062) and first released in this repository.

Disclaimer: The team releasing BigBird did not write a model card for this model, so this model card was written by the Hugging Face team.

Model description

BigBird relies on block sparse attention instead of normal full attention (i.e. BERT's attention) and can handle sequences up to a length of 4096 at a much lower compute cost than BERT. It has achieved state-of-the-art results on various tasks involving very long sequences, such as long-document summarization and question answering with long contexts.
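
To give a feel for where the savings come from, here is a minimal sketch that counts how many (query block, key block) pairs a block sparse pattern touches compared with full attention. It is a simplified illustration and not the library's implementation: the function name `block_sparse_pattern`, the 3-block sliding window, and the choice of the first and last blocks as global blocks are assumptions made for this example, while `block_size=64` and `num_random_blocks=3` are the defaults mentioned in the usage snippet of this card.

```python
import random


def block_sparse_pattern(num_blocks, num_random_blocks=3, window=3, seed=0):
    """Return, for each query block, the sorted key blocks it attends to.

    Simplified illustration (assumed layout): the first and last blocks are
    global, every other block sees a sliding window of neighbouring blocks
    plus a few randomly chosen blocks.
    """
    rng = random.Random(seed)
    global_blocks = {0, num_blocks - 1}
    half = window // 2
    pattern = []
    for q in range(num_blocks):
        if q in global_blocks:
            # Global blocks attend to every block in the sequence.
            pattern.append(list(range(num_blocks)))
            continue
        # Every block attends to the global blocks.
        keys = set(global_blocks)
        # Sliding window: the block itself and its immediate neighbours.
        keys.update(k for k in range(q - half, q + half + 1) if 0 <= k < num_blocks)
        # Random blocks drawn from the blocks not already covered.
        candidates = [k for k in range(num_blocks) if k not in keys]
        keys.update(rng.sample(candidates, min(num_random_blocks, len(candidates))))
        pattern.append(sorted(keys))
    return pattern


seq_len, block_size = 4096, 64
num_blocks = seq_len // block_size  # 64 blocks of 64 tokens each
pattern = block_sparse_pattern(num_blocks)
sparse_pairs = sum(len(keys) for keys in pattern)
full_pairs = num_blocks * num_blocks
print(f"block pairs attended: {sparse_pairs} of {full_pairs}")
```

Under these assumptions, each non-global block attends to a fixed handful of key blocks regardless of sequence length, so the number of attended block pairs grows roughly linearly with the number of blocks rather than quadratically.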

How to use

Here is how to use this model to summarize a given text in PyTorch:

from transformers import BigBirdPegasusForConditionalGeneration, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/bigbird-pegasus-large-pubmed")

# by default encoder-attention is `block_sparse` with num_random_blocks=3, block_size=64
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-pubmed")

# decoder attention type can't be changed & will be "original_full"
# you can change `attention_type` (encoder only) to full attention like this:
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-pubmed", attention_type="original_full")

# you can change `block_size` & `num_random_blocks` like this:
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-pubmed", block_size=16, num_random_blocks=2)

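text = "Replace me with any text you'd like."
# tokenize the input and return PyTorch tensors
inputs = tokenizer(text, return_tensors='pt')
# generate the summary token ids, then decode them back to text
prediction = model.generate(**inputs)
prediction = tokenizer.batch_decode(prediction)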
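
For real documents, it can help to wrap these steps in a small helper that truncates the input and strips special tokens from the output. The following is a minimal sketch, not part of the original release: the function name `summarize` and the defaults `max_input_tokens=4096` (taken from the sequence length quoted in the model description) and `max_new_tokens=256` are assumptions chosen for illustration. It expects an already-loaded tokenizer and model such as the ones created above.

```python
def summarize(text, tokenizer, model, max_input_tokens=4096, max_new_tokens=256):
    """Summarize `text` with an already-loaded tokenizer and model.

    Hypothetical helper: the name and default limits are illustrative
    assumptions, not values prescribed by the model authors.
    """
    # Truncate so the input never exceeds the assumed maximum input length.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # Generate summary token ids; max_new_tokens caps the summary length.
    summary_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence, dropping special tokens.
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
```

With the `tokenizer` and `model` objects from the snippet above, a call would look like `summary = summarize(text, tokenizer, model)`.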

Training Procedure

This checkpoint was obtained by fine-tuning BigBirdPegasusForConditionalGeneration for summarization on the pubmed configuration of the scientific_papers dataset.

BibTeX entry and citation info

@misc{zaheer2021big,
      title={Big Bird: Transformers for Longer Sequences}, 
      author={Manzil Zaheer and Guru Guruganesh and Avinava Dubey and Joshua Ainslie and Chris Alberti and Santiago Ontanon and Philip Pham and Anirudh Ravula and Qifan Wang and Li Yang and Amr Ahmed},
      year={2021},
      eprint={2007.14062},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}