---
license: mit
pipeline_tag: text-classification
---

# roberta-nei-fact-check

This is a machine learning model for text classification built on the RoBERTa architecture with its matching tokenizer. The purpose of this model is to identify whether a given claim, together with its evidence, contains enough information to make a fact-checking decision.

## Model Details

The model was trained using the Adam optimizer with a learning rate of 2e-4, an epsilon of 1e-8, and a weight decay of 2e-8. The training data consisted mainly of the FEVER and HoVer datasets, supplemented by a small sample of manually created data.

The model returns two labels:

- 0: Enough information
- 1: Not enough information

The model uses a tokenizer for text classification and expects input in the form of a claim paired with evidence. For the best results, pass the claim and the evidence together as a text pair.

## Usage

To use this model, load it into your Python code with a library such as PyTorch or TensorFlow. You can then pass in a claim with its evidence, and the model will return a label indicating whether the pair contains enough information for fact-checking.

Here is an example of how to use the model in PyTorch:

```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

# Load the tokenizer and model
tokenizer = RobertaTokenizer.from_pretrained('Dzeniks/roberta-nei-fact-check')
model = RobertaForSequenceClassification.from_pretrained('Dzeniks/roberta-nei-fact-check')

# Define the claim and evidence to classify
claim = "Albert Einstein worked in the field of computer science"
evidence = "Albert Einstein was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time."

# Tokenize the claim and evidence as a text pair
x = tokenizer.encode_plus(claim, evidence, return_tensors="pt")

# Run inference without tracking gradients
model.eval()
with torch.no_grad():
    prediction = model(**x)

# The class with the highest logit is the predicted label
label = torch.argmax(prediction.logits, dim=-1).item()
print(f"Label: {label}")
```

In this example, the claim and evidence strings are tokenized together as a text pair and converted to tensors. The model then classifies the pair, and the resulting label (0 = enough information, 1 = not enough information) is printed to the console.
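
Beyond the raw label, the logits can be converted to probabilities with a softmax to get a rough confidence score. The sketch below is a minimal example of this pattern; the `classify_claim` helper is a hypothetical name introduced here for illustration, not part of the model's API:

```python
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained('Dzeniks/roberta-nei-fact-check')
model = RobertaForSequenceClassification.from_pretrained('Dzeniks/roberta-nei-fact-check')
model.eval()

def classify_claim(claim: str, evidence: str):
    # Hypothetical helper: tokenize the pair, run the model,
    # and turn logits into class probabilities with a softmax
    x = tokenizer(claim, evidence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**x).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    label = int(torch.argmax(probs).item())
    return label, float(probs[label].item())

label, confidence = classify_claim(
    "Albert Einstein worked in the field of computer science",
    "Albert Einstein was a German-born theoretical physicist, widely "
    "acknowledged to be one of the greatest and most influential physicists of all time.",
)
print(f"Label: {label} (confidence: {confidence:.3f})")
```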
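
If you prefer TensorFlow, the same checkpoint can be loaded through the `transformers` TensorFlow classes. This is a sketch under the assumption that the repository ships only PyTorch weights, in which case `from_pt=True` converts them on load:

```python
import tensorflow as tf
from transformers import RobertaTokenizer, TFRobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained('Dzeniks/roberta-nei-fact-check')
# from_pt=True converts the PyTorch weights to TensorFlow at load time
model = TFRobertaForSequenceClassification.from_pretrained(
    'Dzeniks/roberta-nei-fact-check', from_pt=True
)

claim = "Albert Einstein worked in the field of computer science"
evidence = "Albert Einstein was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time."

# Tokenize the claim and evidence as a text pair and classify
x = tokenizer(claim, evidence, return_tensors="tf")
logits = model(**x).logits
label = int(tf.argmax(logits, axis=-1).numpy()[0])
print(f"Label: {label}")
```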