# BERTOverflow
## Model description
We pre-trained a BERT-base model on 152 million sentences from StackOverflow's 10-year archive. More details on this model can be found in our ACL 2020 paper: [Code and Named Entity Recognition in StackOverflow](https://www.aclweb.org/anthology/2020.acl-main.443/). We would like to thank [Wuwei Lan](https://lanwuwei.github.io/) for helping us train this model.
### How to use
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load the BERTOverflow tokenizer and token-classification model
tokenizer = AutoTokenizer.from_pretrained("jeniya/BERTOverflow")
model = AutoModelForTokenClassification.from_pretrained("jeniya/BERTOverflow")
```
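Loading the checkpoint this way attaches a token-classification head, so the model can be run directly on text. The snippet below is a minimal inference sketch, not part of the original card: the example sentence is made up, and the tag names come from `model.config.id2label`, which is only meaningful once the classification head has been fine-tuned (e.g., for the NER task described in the paper).

```python
import torch

# Minimal inference sketch (illustrative, assumes tokenizer/model loaded above):
# tag one sentence and print the predicted label for each wordpiece.
sentence = "How do I sort a list in Python?"
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

predicted_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids):
    # id2label maps class indices to tag names; with an untrained head
    # these default to generic names like LABEL_0.
    print(token, model.config.id2label[int(label_id)])
```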
### BibTeX entry and citation info
```bibtex
@inproceedings{tabassum2020code,
    title = {Code and Named Entity Recognition in StackOverflow},
    author = {Tabassum, Jeniya and Maddela, Mounica and Xu, Wei and Ritter, Alan},
    booktitle = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)},
    url = {https://www.aclweb.org/anthology/2020.acl-main.443/},
    year = {2020},
}
```