---
license: mit
pipeline_tag: text-classification
---

# roberta-nei-fact-check

This is a machine learning model for text classification built on the RoBERTa architecture and its tokenizer. The purpose of this model is to identify whether a given claim, paired with evidence, contains enough information to make a fact-checking decision.

## Model Details

The model was trained using the Adam optimizer with a learning rate of 2e-4, an epsilon of 1e-8, and a weight decay of 2e-8. The training data consisted mainly of the FEVER and HoVer datasets, with a small sample of additionally created data. The model returns two labels:

- 0: Enough information
- 1: Not enough information

The model's tokenizer requires input in the form of a claim paired with evidence. This means the input should contain both the claim and the evidence, encoded together as a single sequence pair, to produce the best results.
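
For illustration, here is a minimal sketch (using placeholder strings rather than real claims) of how a claim/evidence pair is encoded as a single RoBERTa sequence pair:

```python
from transformers import RobertaTokenizer

# Encode a claim and its evidence together as one sequence pair;
# RoBERTa joins the two segments with its separator tokens
tokenizer = RobertaTokenizer.from_pretrained("Dzeniks/roberta-nei-fact-check")
encoded = tokenizer.encode_plus("Example claim", "Example evidence", return_tensors="pt")

print(tokenizer.decode(encoded["input_ids"][0]))
# <s>Example claim</s></s>Example evidence</s>
```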

## Usage

To use this model, load it into your Python code with a library such as PyTorch or TensorFlow, then pass in a claim together with its evidence; the model returns a label indicating whether the pair contains enough information for fact-checking.
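
For quick experiments, the generic `transformers` pipeline API should also work with this checkpoint (a minimal sketch; the `text`/`text_pair` keys are the pipeline's way of passing a sentence pair, and the exact output format may vary across transformers versions):

```python
from transformers import pipeline

# Wrap the checkpoint in a standard text-classification pipeline
classifier = pipeline("text-classification", model="Dzeniks/roberta-nei-fact-check")

# Pass the claim as "text" and the evidence as "text_pair"
result = classifier({
    "text": "Albert Einstein worked in the field of computer science",
    "text_pair": "Albert Einstein was a German-born theoretical physicist.",
})
print(result)  # e.g. {'label': 'LABEL_1', 'score': ...}
```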

Here is a fuller example of how to use the model directly in PyTorch:

```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Load the tokenizer and model
tokenizer = RobertaTokenizer.from_pretrained('Dzeniks/roberta-nei-fact-check')
model = RobertaForSequenceClassification.from_pretrained('Dzeniks/roberta-nei-fact-check')

# Define the claim and evidence to classify
claim = "Albert Einstein worked in the field of computer science"
evidence = "Albert Einstein was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time."

# Tokenize the claim and evidence as a single sequence pair
x = tokenizer.encode_plus(claim, evidence, return_tensors="pt")

# Run inference without tracking gradients
model.eval()
with torch.no_grad():
    prediction = model(**x)

# Take the higher-scoring class: 0 = enough, 1 = not enough information
label = torch.argmax(prediction.logits, dim=1).item()

print(f"Label: {label}")
```

In this example, the `claim` and `evidence` variables contain the text to classify. They are tokenized together as a single sequence pair and converted to tensors, the model classifies the pair, and the resulting label is printed to the console.
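
If a confidence score is useful, the raw logits from the example above can be turned into per-label probabilities with a softmax (a small extension of the example, not part of the original card):

```python
import torch

# Continuing from the example above: softmax over the logits gives
# a probability for each label (0 = enough, 1 = not enough information)
probs = torch.softmax(prediction.logits, dim=1)
print(f"Enough information: {probs[0][0]:.3f}")
print(f"Not enough information: {probs[0][1]:.3f}")
```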