SciBERT-NLI

SciBERT-NLI is the SciBERT model [1] fine-tuned on the SNLI and MultiNLI datasets with the sentence-transformers library to produce universal sentence embeddings [2].

The model uses the original scivocab wordpiece vocabulary and was trained using the average pooling strategy and a softmax loss.
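
To make the pooling step concrete, here is a minimal sketch of average (mean) pooling over the token vectors of one sentence, skipping padding positions. It is a plain-Python illustration and not the sentence-transformers implementation; the function name `mean_pool` and the toy vectors are assumptions made for this example.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors of one sentence, ignoring padding positions."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only the vectors whose mask entry is 1 (real tokens, not padding).
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("at least one non-padding token is required")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: 3 token slots with 2-dimensional vectors; the last slot is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0]
```

The padded slot does not influence the result, which is why the mask matters when sentences of different lengths are batched together.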

Base model: allenai/scibert-scivocab-cased from HuggingFace's AutoModel.

Training time: ~4 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.

Parameters:

| Parameter | Value |
| --- | --- |
| Batch size | 64 |
| Training steps | 20000 |
| Warmup steps | 1450 |
| Lowercasing | True |
| Max. Seq. Length | 128 |
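
As an illustration of how the warmup and training step counts interact, the sketch below computes a learning-rate multiplier under a linear warmup followed by a linear decay. The schedule shape is an assumption, since the card does not name the scheduler, and the peak learning rate is not given here, so only the multiplier is shown. The function name `lr_multiplier` is hypothetical.

```python
def lr_multiplier(step: int, warmup_steps: int = 1450, total_steps: int = 20000) -> float:
    """Scale factor applied to the peak learning rate at a given step.

    Assumed schedule: linear ramp from 0 to 1 during warmup,
    then linear decay from 1 back to 0 at the final step.
    """
    if step < warmup_steps:
        return step / warmup_steps
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(lr_multiplier(725))    # 0.5 (halfway through warmup)
print(lr_multiplier(1450))   # 1.0 (peak, end of warmup)
print(lr_multiplier(20000))  # 0.0 (end of training)

# Sentence pairs processed over the whole run: batch size x training steps.
print(64 * 20000)  # 1280000
```

With these values, warmup covers 1450 of the 20000 steps, or 7.25% of training.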

Performance: The model was evaluated on the test portion of the STS dataset using Spearman rank correlation. It was compared with a general-domain BERT base model trained with the same procedure to check that the two reach similar scores.

| Model | Score |
| --- | --- |
| scibert-nli (this) | 74.50 |
| bert-base-nli-mean-tokens [3] | 77.12 |
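
For readers unfamiliar with the metric, the following sketch shows how a Spearman rank correlation between predicted similarities and gold similarity scores could be computed. It is a plain-Python illustration on made-up numbers, not the evaluation script used for the table above. Reporting the correlation multiplied by 100 is assumed here to match the scale of the scores.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(pred), average_ranks(gold))


# Made-up example: model similarities for four sentence pairs vs. gold scores.
predicted = [0.91, 0.15, 0.48, 0.70]
gold = [4.8, 0.5, 3.5, 2.0]
rho = spearman(predicted, gold)
print(f"{100 * rho:.2f}")  # 80.00
```

Because only ranks are compared, the metric does not depend on the predicted similarities being on the same scale as the gold scores.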

An example usage for similarity-based scientific paper retrieval is provided in the Covid Papers Browser repository.
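
The general idea of similarity-based retrieval can be sketched as follows: embed the query and every paper, then rank papers by cosine similarity to the query. This is a hedged, plain-Python illustration and not code from the repository mentioned above. The toy 3-dimensional vectors stand in for sentence embeddings that would come from the model, and the names `cosine`, `retrieve`, and `paper_a` to `paper_c` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec: Sequence[float],
             corpus: Dict[str, List[float]],
             top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the top_k (title, similarity) pairs, most similar first."""
    scored = [(title, cosine(query_vec, vec)) for title, vec in corpus.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for model outputs (assumption for illustration).
corpus = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.0, 1.0, 0.0],
    "paper_c": [0.7, 0.7, 0.0],
}
query = [1.0, 0.0, 0.0]
hits = retrieve(query, corpus, top_k=2)
print([title for title, _ in hits])  # ['paper_a', 'paper_c']
```

For a large collection, the paper embeddings would typically be computed once and stored, so that only the query has to be embedded at search time.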

References:

[1] I. Beltagy et al., SciBERT: A Pretrained Language Model for Scientific Text

[2] A. Conneau et al., Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

[3] N. Reimers and I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
