julien-c HF staff committed on
Commit
6c87910
1 Parent(s): 9bfe4af

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/allenai/longformer-base-4096/README.md

Files changed (1)
  1. README.md +24 -0
README.md ADDED
@@ -0,0 +1,24 @@
+
+ # longformer-base-4096
+ [Longformer](https://arxiv.org/abs/2004.05150) is a transformer model for long documents.
+
+ `longformer-base-4096` is a BERT-like model started from the RoBERTa checkpoint and pretrained for MLM on long documents. It supports sequences of length up to 4,096.
+
+ Longformer uses a combination of a sliding window (local) attention and global attention. Global attention is user-configured based on the task to allow the model to learn task-specific representations.
+ Please refer to the examples in `modeling_longformer.py` and the paper for more details on how to set global attention.
+
+
+ ### Citing
+
+ If you use `Longformer` in your research, please cite [Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150).
+ ```
+ @article{Beltagy2020Longformer,
+ title={Longformer: The Long-Document Transformer},
+ author={Iz Beltagy and Matthew E. Peters and Arman Cohan},
+ journal={arXiv:2004.05150},
+ year={2020},
+ }
+ ```
+
+ `Longformer` is an open-source project developed by [the Allen Institute for Artificial Intelligence (AI2)](http://www.allenai.org).
+ AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.
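
The README added in this commit describes attention as a mix of a sliding window and user-chosen global positions. The following is a minimal plain-Python sketch of that pattern, not the code in `modeling_longformer.py`. The helper names `build_global_attention_mask` and `attention_pattern`, the toy sequence length, and the window size are all hypothetical and chosen for illustration. It assumes the convention from the paper that global attention is symmetric: a global token attends to every token, and every token attends to it.

```python
def build_global_attention_mask(seq_len, global_positions):
    """Return a 0/1 list: 1 marks a token with global attention, 0 a local-only token."""
    mask = [0] * seq_len
    for pos in global_positions:
        if not 0 <= pos < seq_len:
            raise ValueError(f"position {pos} is outside a sequence of length {seq_len}")
        mask[pos] = 1
    return mask


def attention_pattern(global_mask, one_sided_window):
    """Return a square boolean matrix; entry [i][j] is True if token i may attend to token j."""
    n = len(global_mask)
    pattern = []
    for i in range(n):
        row = []
        for j in range(n):
            # Local attention: token j lies within the sliding window around token i.
            local = abs(i - j) <= one_sided_window
            # Global attention (assumed symmetric): either endpoint is a global token.
            is_global = global_mask[i] == 1 or global_mask[j] == 1
            row.append(local or is_global)
        pattern.append(row)
    return pattern


# Toy example: 8 tokens, global attention only on token 0
# (for instance, a classification token at the start of the input).
mask = build_global_attention_mask(seq_len=8, global_positions=[0])
pattern = attention_pattern(mask, one_sided_window=1)

# Print the pattern: "x" where attention is allowed, "." where it is not.
for row in pattern:
    print("".join("x" if allowed else "." for allowed in row))
```

In the `transformers` library, the Longformer forward pass accepts a `global_attention_mask` tensor that follows the same 0/1 convention as `mask` here. Which positions to mark depends on the task, as the README notes.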