---
language: "english"
tags:
license: "mit"
datasets:
- race
- ai2_arc
- openbookqa
metrics:
- accuracy
---

# Roberta Large Fine Tuned on RACE

## Model description

This model follows the implementation by the Allen AI team of the [Aristo Roberta V7 Model](https://leaderboard.allenai.org/arc/submission/blcotvl7rrltlue6bsv0) submitted to the [ARC Challenge](https://leaderboard.allenai.org/arc/submissions/public) leaderboard.

#### How to use

```python
import logging

import datasets
import torch
from transformers import RobertaTokenizer
from transformers import RobertaForMultipleChoice

MAX_SEQ_LENGTH = 256  # matches the max_length hyperparameter reported below

tokenizer = RobertaTokenizer.from_pretrained(
    "LIAMF-USP/aristo-roberta")
model = RobertaForMultipleChoice.from_pretrained(
    "LIAMF-USP/aristo-roberta")

# "arc" refers to the dataset script used by the authors; each example is
# expected to provide "example_id", "question", "answer" and "options"
# fields (each option with "option_context" and "option_text")
dataset = datasets.load_dataset(
    "arc",
    split=["train", "validation", "test"],
)
training_examples = dataset[0]
evaluation_examples = dataset[1]
test_examples = dataset[2]

example = training_examples[0]
example_id = example["example_id"]
question = example["question"]
label_example = example["answer"]
options = example["options"]

# map the gold answer (letter or digit) to a 0-based label index
if label_example in ["A", "B", "C", "D", "E"]:
    label_map = {label: i for i, label in enumerate(
                 ["A", "B", "C", "D", "E"])}
elif label_example in ["1", "2", "3", "4", "5"]:
    label_map = {label: i for i, label in enumerate(
                 ["1", "2", "3", "4", "5"])}
else:
    print(f"{label_example} not found")

# pad the options so that every example has exactly five choices
while len(options) < 5:
    empty_option = {}
    empty_option['option_context'] = ''
    empty_option['option_text'] = ''
    options.append(empty_option)

choices_inputs = []
for ending_idx, option in enumerate(options):
    ending = option["option_text"]
    context = option["option_context"]
    if question.find("_") != -1:
        # fill-in-the-blank questions
        question_option = question.replace("_", ending)
    else:
        question_option = question + " " + ending

    inputs = tokenizer(
        context,
        question_option,
        add_special_tokens=True,
        max_length=MAX_SEQ_LENGTH,
        padding="max_length",
        truncation=True,
        return_overflowing_tokens=False,
    )

    if "num_truncated_tokens" in inputs and inputs["num_truncated_tokens"] > 0:
        logging.warning(
            f"Question: {example_id} with option {ending_idx} was truncated")

    choices_inputs.append(inputs)

label = label_map[label_example]

input_ids = [x["input_ids"] for x in choices_inputs]
attention_mask = (
    [x["attention_mask"] for x in choices_inputs]
    # as the sentences follow the same structure, checking just one of
    # them is enough
    if "attention_mask" in choices_inputs[0]
    else None
)

# RoBERTa does not use token type ids, so they are omitted here, and
# example_id is kept out because the model's forward does not accept it;
# the model expects tensors of shape (batch_size, num_choices, seq_length)
example_encoded = {
    "input_ids": torch.tensor(input_ids).unsqueeze(0),
    "attention_mask": (torch.tensor(attention_mask).unsqueeze(0)
                       if attention_mask is not None else None),
    "labels": torch.tensor(label).unsqueeze(0),
}

output = model(**example_encoded)
```

## Training data

The training data was the same as proposed [here](https://leaderboard.allenai.org/arc/submission/blcotvl7rrltlue6bsv0). The only difference was the hyperparameters of the RACE fine-tuned model, which were reported [here](https://huggingface.co/LIAMF-USP/roberta-large-finetuned-race#eval-results).

## Training procedure

It was necessary to preprocess the data with a method that is exemplified for a single instance in the _How to use_ section; a sketch of how this could be applied to a whole split is shown below.
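The following is a minimal sketch of wrapping the per-instance preprocessing in a function and applying it to an entire split with `datasets.Dataset.map`, reusing `tokenizer` and `MAX_SEQ_LENGTH` from the snippet above. `encode_example` is an illustrative helper name, not part of the original code.

```python
# Hedged sketch: applies the single-instance preprocessing to a whole split;
# `encode_example` is a hypothetical helper, not the authors' original code.
def encode_example(example):
    options = example["options"]
    # pad to five options, as in the single-instance example
    while len(options) < 5:
        options.append({"option_context": "", "option_text": ""})

    contexts, question_options = [], []
    for option in options:
        ending = option["option_text"]
        if "_" in example["question"]:
            question_options.append(example["question"].replace("_", ending))
        else:
            question_options.append(example["question"] + " " + ending)
        contexts.append(option["option_context"])

    # tokenizing the two lists pairwise yields one encoding per option
    inputs = tokenizer(
        contexts,
        question_options,
        add_special_tokens=True,
        max_length=MAX_SEQ_LENGTH,
        padding="max_length",
        truncation=True,
    )

    labels = ["A", "B", "C", "D", "E"] if example["answer"].isalpha() \
        else ["1", "2", "3", "4", "5"]
    return {
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"],
        "label": labels.index(example["answer"]),
    }

encoded_train = training_examples.map(encode_example)
encoded_eval = evaluation_examples.map(encode_example)
```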
The hyperparameters used were the following:

| Hyperparameter | Value |
|:----:|:----:|
| adam_beta1 | 0.9 |
| adam_beta2 | 0.98 |
| adam_epsilon | 1.000e-8 |
| eval_batch_size | 16 |
| train_batch_size | 4 |
| fp16 | True |
| gradient_accumulation_steps | 4 |
| learning_rate | 0.00001 |
| warmup_steps | 0.06 |
| max_length | 256 |
| epochs | 4 |

The other parameters were the default ones from [Trainer](https://huggingface.co/transformers/main_classes/trainer.html) and [TrainingArguments](https://huggingface.co/transformers/main_classes/trainer.html#trainingarguments); a sketch of this configuration is given at the end of this card.

## Eval results

| Dataset | Acc |
|:----:|:----:|
| Challenge Test | 65.358 |

**The model was trained with a TITAN RTX.**
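For reference, a minimal sketch of how the hyperparameter table above could map onto `TrainingArguments` is shown below. This is an illustration under stated assumptions, not the authors' original training script: the `output_dir` is hypothetical, and the reported `warmup_steps` value of 0.06 is interpreted as a warmup ratio.

```python
from transformers import TrainingArguments

# Hedged sketch mapping the reported hyperparameters onto TrainingArguments;
# output_dir is a hypothetical path, and warmup_steps=0.06 from the table
# is read as a warmup *ratio* rather than a step count.
training_args = TrainingArguments(
    output_dir="./aristo-roberta",     # hypothetical output path
    per_device_train_batch_size=4,     # train_batch_size
    per_device_eval_batch_size=16,     # eval_batch_size
    gradient_accumulation_steps=4,
    learning_rate=1e-5,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    warmup_ratio=0.06,
    num_train_epochs=4,
    fp16=True,
)
```

These arguments would then be passed to a `Trainer` together with the model and the encoded datasets.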