metadata
language:
- en
pipeline_tag: token-classification
license: apache-2.0
Named Entity Recognition (NER) model to recognize gene and protein entities.
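A minimal usage sketch, assuming the model is loaded through the Hugging Face `transformers` token-classification pipeline. `MODEL_ID` is a hypothetical placeholder for this model's Hub repository id, and `load_tagger` and `extract_mentions` are illustrative helper names that are not part of the model.

```python
from typing import Dict, List, Tuple

# Hypothetical placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "<hub-user>/<model-name>"


def load_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so that extract_mentions below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_mentions(text: str, entities: List[Dict]) -> List[Tuple[int, int, str]]:
    """Turn aggregated pipeline output into (start, end, surface form) tuples."""
    return [
        (int(e["start"]), int(e["end"]), text[e["start"]:e["end"]])
        for e in entities
    ]
```

With a real repository id in place, `extract_mentions(text, load_tagger()(text))` would return the character offsets and surface form of each predicted gene or protein mention in `text`.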
Please cite our work:
@article{NILNKER2022,
title = {NILINKER: Attention-based approach to NIL Entity Linking},
journal = {Journal of Biomedical Informatics},
volume = {132},
pages = {104137},
year = {2022},
issn = {1532-0464},
doi = {https://doi.org/10.1016/j.jbi.2022.104137},
url = {https://www.sciencedirect.com/science/article/pii/S1532046422001526},
author = {Pedro Ruas and Francisco M. Couto},
}
PubMedBERT fine-tuned on the following datasets:
- miRNA-Test-Corpus: entity type "Genes/Proteins"
- CellFinder: entity type "GeneProtein"
- CoMAGC: entity type "Gene"
- CRAFT: entity type "PR"
- GREC Corpus: entity types "Gene", "Protein", "Protein_Complex", "Enzyme"
- JNLPBA: entity types "protein", "DNA", "RNA"
- PGxCorpus: entity type "Gene_or_protein"
- FSU_PRGE: entity types "protein", "protein_complex", "protein_familiy_or_group"
- BC2GM corpus
- CHEMPROT: entity types "GENE-Y", "GENE-N"
- mTOR pathway event corpus: entity type "Protein"
- DNA Methylation
- BioNLP11ID: entity type "Gene/protein"
- BioNLP09
- BioNLP11EPI
- BioNLP13CG: entity type "gene_or_gene_product"
- BioNLP13GE: entity type "Protein"
- BioNLP13PC: entity type "Gene_or_gene_product"
- MLEE: entity type "Gene_or_gene_product"
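
The corpora above use different names for gene and protein annotations, so fine-tuning a single model on all of them requires some form of label harmonization. This card does not say how that was done. The sketch below shows one possible approach, assuming BIO-tagged input and a single unified label; the label name `Gene`, the function `to_unified_bio`, and the BIO scheme are all assumptions. Corpora listed without an entity type are left out of the mapping.

```python
from typing import Dict, List, Set

# Entity types copied from the dataset list above.
GENE_PROTEIN_TYPES: Dict[str, Set[str]] = {
    "miRNA-Test-Corpus": {"Genes/Proteins"},
    "CellFinder": {"GeneProtein"},
    "CoMAGC": {"Gene"},
    "CRAFT": {"PR"},
    "GREC Corpus": {"Gene", "Protein", "Protein_Complex", "Enzyme"},
    "JNLPBA": {"protein", "DNA", "RNA"},
    "PGxCorpus": {"Gene_or_protein"},
    "FSU_PRGE": {"protein", "protein_complex", "protein_familiy_or_group"},
    "CHEMPROT": {"GENE-Y", "GENE-N"},
    "mTOR pathway event corpus": {"Protein"},
    "BioNLP11ID": {"Gene/protein"},
    "BioNLP13CG": {"gene_or_gene_product"},
    "BioNLP13GE": {"Protein"},
    "BioNLP13PC": {"Gene_or_gene_product"},
    "MLEE": {"Gene_or_gene_product"},
}


def to_unified_bio(corpus: str, tags: List[str], unified: str = "Gene") -> List[str]:
    """Map corpus-specific BIO tags to one label; every other type becomes "O"."""
    wanted = GENE_PROTEIN_TYPES.get(corpus, set())
    out = []
    for tag in tags:
        # Split only on the first hyphen so types such as "GENE-Y" stay intact.
        prefix, _, etype = tag.partition("-")
        if prefix in ("B", "I") and etype in wanted:
            out.append(f"{prefix}-{unified}")
        else:
            out.append("O")
    return out
```

For example, `to_unified_bio("CHEMPROT", ["B-GENE-Y", "I-GENE-Y", "O", "B-CHEMICAL"])` keeps the gene span under the unified label and maps the chemical mention to `O`.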