NER model based on allenai/scibert_scivocab_cased
Fine-tuned using the SciERC Dataset to identify scientific terms:
- Task: Applications, problems to solve, systems to construct. E.g. information extraction, machine reading system, image segmentation, etc.
- Method: Methods, models, systems to use, or tools, components of a system, frameworks. E.g. language model, CORENLP, POS parser, kernel method, etc.
- Evaluation Metric: Metrics, measures, or entities that can express the quality of a system/method. E.g. F1, BLEU, Precision, Recall, ROC curve, mean reciprocal rank, mean-squared error, robustness, time complexity, etc.
- Material: Data, datasets, resources, corpora, knowledge bases. E.g. image data, speech data, stereo images, bilingual dictionary, paraphrased questions, CoNLL, Penn Treebank, WordNet, Wikipedia, etc.
- Other Scientific Terms: Phrases that are scientific terms but do not fall into any of the above classes. E.g. physical or geometric constraints, qualitative prior knowledge, discourse structure, syntactic rule, tree, node, tree kernel, features, noise, criteria, etc.
- Generic: General terms or pronouns that may refer to an entity but are not themselves informative, often used as connection words. E.g. model, approach, prior knowledge, them, it, etc.
Training
- Learning Rate: 1e-05
- Epochs: 10
Performance
- Eval Loss: 0.401
- Precision: 0.577
- Recall: 0.632
- F1: 0.603
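As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall. This sketch assumes F1 here is the usual harmonic mean of the two; the card does not say how the scores were averaged across entity classes.

```python
# Values copied from the Performance list above.
precision = 0.577
recall = 0.632

# F1 as the harmonic mean of precision and recall (assumed definition).
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.3f}")
```

The result rounds to the reported 0.603, so the three figures agree with each other.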
Colab
Check out how this model is used for NER-enhanced topic modelling, inspired by BERTopic.
Use
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline
tokenizer = AutoTokenizer.from_pretrained("RJuro/SciNERTopic")
model_trf = AutoModelForTokenClassification.from_pretrained("RJuro/SciNERTopic")
nlp = pipeline("ner", model=model_trf, tokenizer=tokenizer, aggregation_strategy='average')
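With an aggregation strategy set, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start`, and `end`. The sketch below shows one possible way to collect the detected terms by class. The helper `group_entities`, the `min_score` threshold, and the sample output (including its label names and scores) are hypothetical and hand-written for illustration. The real label names can be read from `model_trf.config.id2label`.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.5):
    """Collect entity text by label from aggregated NER pipeline output.

    `min_score` is an illustrative threshold, not a recommended value.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Hypothetical output, written by hand for the sentence
# "We use a language model for information extraction on Wikipedia and evaluate it."
# In practice this list would come from: nlp(sentence)
sample_output = [
    {"entity_group": "Method", "score": 0.91, "word": "language model", "start": 9, "end": 23},
    {"entity_group": "Task", "score": 0.88, "word": "information extraction", "start": 28, "end": 50},
    {"entity_group": "Material", "score": 0.84, "word": "Wikipedia", "start": 54, "end": 63},
    {"entity_group": "Generic", "score": 0.31, "word": "it", "start": 77, "end": 79},
]

grouped = group_entities(sample_output)
print(grouped)
```

Grouping by class like this is a convenient starting point when the extracted terms are fed into a downstream step such as topic modelling.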
Cite this model
@misc {roman_jurowetzki_2022,
author = { Roman Jurowetzki and Hamid Bekamiri },
title = { SciNERTopic - NER enhanced transformer-based topic modelling for scientific text },
year = 2022,
url = { https://huggingface.co/RJuro/SciNERTopic },
doi = { 10.57967/hf/0095 },
publisher = { Hugging Face }
}