---
language:
- en
pipeline_tag: token-classification
license: apache-2.0
---

A Named Entity Recognition (NER) model that recognizes gene and protein entities.

Please cite our work:

```
@article{NILNKER2022,
title = {NILINKER: Attention-based approach to NIL Entity Linking},
journal = {Journal of Biomedical Informatics},
volume = {132},
pages = {104137},
year = {2022},
issn = {1532-0464},
doi = {https://doi.org/10.1016/j.jbi.2022.104137},
url = {https://www.sciencedirect.com/science/article/pii/S1532046422001526},
author = {Pedro Ruas and Francisco M. Couto},
}
```
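
As a rough usage sketch, a token-classification model like this one can be loaded through the `transformers` pipeline API. The repository id is not stated in this card, so `model_id` is left as a parameter. The helper names `load_tagger` and `entity_mentions` and the `min_score` threshold are illustrative assumptions, not part of the released model.

```python
from typing import Dict, List


def load_tagger(model_id: str):
    """Build a token-classification pipeline for a Hub repository id (assumed known)."""
    # Imported lazily so the helper below also works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def entity_mentions(predictions: List[Dict], text: str, min_score: float = 0.5) -> List[str]:
    """Return the surface strings of predicted spans scoring at least min_score."""
    return [
        text[p["start"]:p["end"]]
        for p in predictions
        if p["score"] >= min_score
    ]


# Example (requires network access and the real repository id):
# tagger = load_tagger("<this-model-repository-id>")
# text = "BRCA1 interacts with p53."
# print(entity_mentions(tagger(text), text))
```

Slicing the original text by the `start` and `end` offsets keeps the original casing, which the lowercased `word` field of an uncased model would lose.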

This model is [PubMedBERT](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext) fine-tuned on the following datasets:

- [miRNA-Test-Corpus](https://www.scai.fraunhofer.de/en/business-research-areas/bioinformatics/downloads/download-mirna-test-corpus.html): entity type "Genes/Proteins"
- [CellFinder](https://www.informatik.hu-berlin.de/de/forschung/gebiete/wbi/resources/cellfinder/): entity type "GeneProtein"
- [CoMAGC](http://biopathway.org/CoMAGC/): entity type "Gene"
- [CRAFT](https://github.com/UCDenver-ccp/CRAFT/tree/master/concept-annotation): entity type "PR"
- [GREC Corpus](http://www.nactem.ac.uk/GREC/standoff.php): entity types "Gene", "Protein", "Protein_Complex", "Enzyme"
- [JNLPBA](http://www.geniaproject.org/shared-tasks/bionlp-jnlpba-shared-task-2004): entity types "protein", "DNA", "RNA"
- [PGxCorpus](https://www.nature.com/articles/s41597-019-0342-9): entity type "Gene_or_protein"
- [FSU_PRGE](https://julielab.de/Resources/FSU_PRGE.html): entity types "protein", "protein_complex", "protein_familiy_or_group"
- [BC2GM corpus](https://github.com/spyysalo/bc2gm-corpus)
- [CHEMPROT](https://biocreative.bioinformatics.udel.edu/resources/corpora/chemprot-corpus-biocreative-vi/): entity types "GENE-Y", "GENE-N"
- [mTOR pathway event corpus](https://github.com/openbiocorpora/mtor-pathway/tree/master/original-data): entity type "Protein"
- [DNA Methylation](https://github.com/openbiocorpora/dna-methylation/tree/master/original-data)
- [BioNLP11ID](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BioNLP11ID-ggp-IOB): entity type "Gene/protein"
- [BioNLP09](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BioNLP09-IOB)
- [BioNLP11EPI](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BioNLP11EPI-IOB)
- [BioNLP13CG](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BioNLP13CG-ggp-IOB): entity type "gene_or_gene_product"
- [BioNLP13GE](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BioNLP13GE-IOB): entity type "Protein"
- [BioNLP13PC](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BioNLP13PC-ggp-IOB): entity type "Gene_or_gene_product"
- [MLEE](http://nactem.ac.uk/MLEE/): entity type "Gene_or_gene_product"
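
Because the corpora above use different names for gene and protein annotations, combining them requires mapping each corpus's labels onto one shared tag. This card does not describe the preprocessing that was actually used, so the snippet below is only a minimal sketch of one way to do it. It assumes IOB2-tagged input, and the target label `GENE`, the `harmonize` function and the partial `GENE_PROTEIN_LABELS` table are illustrative names.

```python
from typing import Dict, List, Set

# Entity types copied from the dataset list above (partial, for illustration).
GENE_PROTEIN_LABELS: Dict[str, Set[str]] = {
    "CellFinder": {"GeneProtein"},
    "CRAFT": {"PR"},
    "GREC": {"Gene", "Protein", "Protein_Complex", "Enzyme"},
    "JNLPBA": {"protein", "DNA", "RNA"},
    "CHEMPROT": {"GENE-Y", "GENE-N"},
}


def harmonize(tags: List[str], keep: Set[str], target: str = "GENE") -> List[str]:
    """Map IOB2 tags whose type is in `keep` to `target`; all other tags become "O"."""
    out: List[str] = []
    for tag in tags:
        # Split only on the first hyphen so labels like "GENE-Y" stay intact.
        prefix, _, label = tag.partition("-")
        if prefix in {"B", "I"} and label in keep:
            # An I- tag that follows a dropped tag must open a new span.
            if prefix == "I" and (not out or out[-1] == "O"):
                prefix = "B"
            out.append(f"{prefix}-{target}")
        else:
            out.append("O")
    return out


# JNLPBA also annotates cell types, which are dropped here.
jnlpba_tags = ["B-protein", "I-protein", "O", "B-cell_type", "B-DNA"]
print(harmonize(jnlpba_tags, GENE_PROTEIN_LABELS["JNLPBA"]))
```

Mapping every retained type to a single tag gives a model trained on the merged data one label set to predict, at the cost of the finer distinctions (for example protein versus DNA) that some corpora make.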