Model Description

This model is a fine-tuned version of Minerva, trained on the Medical Meadow Flashcard Dataset for question answering. The base Minerva model was developed by the Sapienza NLP Team in collaboration with Future Artificial Intelligence Research (FAIR) and CINECA. I used the 350-million-parameter version because of computational limits; versions with 1 billion and 3 billion parameters also exist. For more details, refer to their repositories: Sapienza NLP on Hugging Face and Minerva LLMs.

Issues and Possible Solutions

In the original fine-tuned version, my model tended to keep generating after the answer was complete, which led to repeated sentences and lower quality as the output grew longer. Parameters like 'max_length' or 'max_new_tokens' did not help, because they only cut generation off at a fixed length without properly concluding the sentence. To address this issue, I redefined the stopping criteria to terminate the generation at the first period ('.'), as demonstrated in the code below:

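from transformers import StoppingCriteria, StoppingCriteriaList


class newStoppingCriteria(StoppingCriteria):
    """Stop generation once the stop word appears in the decoded text."""

    def __init__(self, stop_word):
        self.stop_word = stop_word

    def __call__(self, input_ids, scores, **kwargs):
        # `tokenizer` is the tokenizer loaded alongside the model.
        # input_ids holds the prompt plus the tokens generated so far, so a
        # prompt that already contains the stop word stops generation after
        # the first new token.
        decoded_text = tokenizer.decode(input_ids[0], skip_special_tokens=True)
        return self.stop_word in decoded_text


criteria = newStoppingCriteria(stop_word=".")
stoppingCriteriaList = StoppingCriteriaList([criteria])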
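
One caveat of checking the whole decoded sequence is that it also contains the prompt, so a question with a period in it (an abbreviation, for instance) would trigger the stop too early. The following is a minimal sketch of the same check restricted to the newly generated text, written as a plain-Python helper so it can be tried without loading a model. `should_stop` and its parameters are hypothetical names, not part of the original code or of the transformers library.

```python
def should_stop(decoded_text: str, prompt_text: str, stop_word: str = ".") -> bool:
    """Return True once stop_word appears in the text generated after the prompt."""
    # Assumption: the decoded sequence starts with the decoded prompt,
    # which is the usual case when a decoder-only model echoes its input.
    if decoded_text.startswith(prompt_text):
        continuation = decoded_text[len(prompt_text):]
    else:
        # If the prefix does not match, fall back to checking the full text.
        continuation = decoded_text
    return stop_word in continuation
```

Inside a stopping criterion, `decoded_text` would come from `tokenizer.decode(input_ids[0], skip_special_tokens=True)` and `prompt_text` from decoding the question in the same way.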

Since the preprocessed text was formatted as "BoS token - Question - EoS token - BoS token - Answer - EoS token", the generated output included the question as well as the answer. To resolve this, I implemented a method to remove the question from the generated text, leaving only the answer:

outputText = tokenizer.decode(output_ids[0], skip_special_tokens = True)
inputText = tokenizer.decode(inputEncoding.input_ids[0], skip_special_tokens = True)
answer = outputText[len(inputText):].strip()
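
The character-offset slicing above assumes that the decoded output begins with exactly the decoded question. The following is a minimal sketch of a slightly more defensive variant. `extract_answer` is a hypothetical helper name, and the fallback of returning the full text when the prefix does not match is an assumption rather than something the original code does.

```python
def extract_answer(output_text: str, input_text: str) -> str:
    """Strip the echoed question from the decoded output, leaving the answer."""
    if output_text.startswith(input_text):
        return output_text[len(input_text):].strip()
    # Fallback (assumption): if decoding altered the prefix, for example
    # through whitespace normalisation, return the whole text rather than
    # slicing at the wrong offset.
    return output_text.strip()


# Illustrative strings only; no model is involved here.
question = "What causes Wernicke encephalopathy?"
decoded = question + " Thiamine (B1) deficiency."
print(extract_answer(decoded, question))
```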

Usage Example

# Assumes `model` and `tokenizer` are already loaded, the model is on the GPU,
# and `stoppingCriteriaList` is defined as shown earlier.
question = 'What causes Wernicke encephalopathy?'

inputEncoding = tokenizer(question, return_tensors='pt').to('cuda')
output_ids = model.generate(
    inputEncoding.input_ids,
    max_length=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.97,
    top_k=2,
    pad_token_id=tokenizer.eos_token_id,
    repetition_penalty=1.2,
    stopping_criteria=stoppingCriteriaList,
)

# Decode the full output and the prompt, then drop the echoed question.
outputText = tokenizer.decode(output_ids[0], skip_special_tokens=True)
inputText = tokenizer.decode(inputEncoding.input_ids[0], skip_special_tokens=True)
answer = outputText[len(inputText):].strip()

# Generated Answer: Wernicke encephalopathy is caused by a defect in the Wern-Herxheimer reaction, which leads to an accumulation of acid and alkaline phosphatase activity.
# Reference Answer: The underlying pathophysiologic cause of Wernicke encephalopathy is thiamine (B1) deficiency.

Training Information

The model was fine-tuned for 3 epochs using the hyperparameters specified in the base model's original repository:

from transformers import TrainingArguments

trainingArgs = TrainingArguments(
    output_dir="MedicalFlashcardsMinerva",
    # Newer transformers releases rename this argument to `eval_strategy`.
    evaluation_strategy="steps",
    save_strategy="steps",
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    weight_decay=0.01,
    logging_steps=100,
    report_to="none",
)
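
For orientation, the batch settings above imply an effective batch size of 6 × 8 = 48 examples per optimizer step on each device. The following is a minimal sketch of that arithmetic and of the resulting warmup length. The dataset size and the single-device setup are placeholder assumptions, not figures from the original training run, and the rounding only approximates what the Trainer does internally.

```python
import math

# Values taken from the training arguments above.
per_device_train_batch_size = 6
gradient_accumulation_steps = 8
num_train_epochs = 3
warmup_ratio = 0.1

# Placeholder assumptions, not taken from the original run.
num_devices = 1
num_train_examples = 30_000

# Examples consumed per optimizer step.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(num_train_examples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs
# Number of steps over which the learning rate ramps up before cosine decay.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```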

Model repository: FabioS08/MedicalFlashcardsMinerva
