library_name: transformers
license: apache-2.0
datasets:
- raidium/ECNQA_generated_questions
- raidium/ECN-QA
language:
- en
metrics:
- accuracy
tags:
- medical
base_model: stanford-crfm/BioMedLM
Model Card for Raidium MQG model
The model is introduced in the paper "Efficient Medical Question Answering with Knowledge-Augmented Question Generation".
Paper: https://arxiv.org/abs/2405.14654
MQG is a transformer language model pre-trained on a series of medical textbooks and on medical questions generated by GPT-4. The weights are initialized from BioMedLM, then further pre-trained on those datasets.
The questions were generated from prompts containing medical data from the textbooks. They are available here: ECNQA_generated_questions.
MQG is designed to be fine-tuned for Medical Question Answering tasks.
Model Details
Model Description
In the expanding field of language model applications, medical knowledge representation remains a significant challenge due to the specialized nature of the domain. Large language models, such as GPT-4, obtain reasonable scores on medical question answering tasks, but smaller models are far behind. In this work, we introduce a method to improve the proficiency of a small language model in the medical domain by employing a two-fold approach. We first fine-tune the model on a corpus of medical textbooks. Then, we use GPT-4 to generate questions similar to the downstream task, prompted with textbook knowledge, and use them to fine-tune the model. We show the benefits of our training strategy on a medical question answering dataset.
Using the model
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("raidium/MQG")
model = AutoModelForCausalLM.from_pretrained("raidium/MQG")
- Developed by: Raidium
- Model type: Transformer
- License: Apache 2.0
- Finetuned from model: BioMedLM
Model Sources
- Repository: https://github.com/raidium-med/MQG
- Paper: https://arxiv.org/abs/2405.14654
Uses
Direct Use
MQG is trained with next-token prediction on generated questions, so it can be used out of the box to generate candidate answers for medical question answering tasks. However, the generated questions may contain errors, so we advise fine-tuning the model on your own dataset and using it to rank the candidate answers.
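One possible way to rank candidate answers with a causal language model is to compare the log-likelihood the model assigns to each candidate given the question. The sketch below is an illustration under assumptions, not the procedure used in the paper: the helper names `candidate_log_likelihood` and `rank_candidates` are hypothetical, and the way the question and candidate are concatenated (a single space) is an arbitrary choice.

```python
import torch
import torch.nn.functional as F


def candidate_log_likelihood(logits, input_ids, prompt_len):
    """Sum of log-probabilities of the candidate tokens given the prompt.

    logits:     (seq_len, vocab_size) causal LM outputs for one sequence
    input_ids:  (seq_len,) token ids of prompt followed by candidate
    prompt_len: number of tokens that belong to the prompt (at least 1)
    """
    # Logits at position t predict the token at position t + 1.
    log_probs = F.log_softmax(logits[:-1], dim=-1)
    targets = input_ids[1:]
    token_log_probs = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    # Keep only the positions whose target is a candidate token.
    return token_log_probs[prompt_len - 1:].sum().item()


def rank_candidates(model, tokenizer, question, candidates):
    """Hypothetical helper: sort candidates from most to least likely."""
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids[0]
    scored = []
    for candidate in candidates:
        # Assumption: question and candidate are joined by a single space.
        cand_ids = tokenizer(" " + candidate, return_tensors="pt").input_ids[0]
        input_ids = torch.cat([prompt_ids, cand_ids])
        with torch.no_grad():
            logits = model(input_ids.unsqueeze(0)).logits[0]
        score = candidate_log_likelihood(logits, input_ids, len(prompt_ids))
        scored.append((score, candidate))
    return [c for _, c in sorted(scored, reverse=True)]
```

With the `tokenizer` and `model` loaded as shown under "Using the model", `rank_candidates(model, tokenizer, question, candidates)` returns the candidates in decreasing order of score. Summed log-likelihood favors shorter candidates, so dividing by the number of candidate tokens may be preferable when answer lengths differ widely.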
Downstream Use
MQG can be fine-tuned for Medical Question Answering tasks. For multiple choice questions, a classification head should be appended at the end of the model, to rank different proposed answers.
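As a minimal sketch of what such a head could look like, the PyTorch module below maps the hidden state of the last non-padding token of each (question, answer) sequence to a single score. The class name `AnswerRankingHead`, the last-token pooling, and the assumption of right-padded inputs are illustrative choices; this card does not specify the head or the loss used for fine-tuning.

```python
import torch
import torch.nn as nn


class AnswerRankingHead(nn.Module):
    """Hypothetical linear head producing one score per proposed answer."""

    def __init__(self, hidden_size):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states, attention_mask):
        # hidden_states:  (num_answers, seq_len, hidden_size)
        # attention_mask: (num_answers, seq_len), 1 for tokens, 0 for padding
        # Assumption: sequences are right-padded, so the last real token
        # sits at index (number of real tokens - 1).
        last_index = attention_mask.long().sum(dim=1) - 1
        batch_index = torch.arange(hidden_states.size(0))
        pooled = hidden_states[batch_index, last_index]
        # One scalar score per answer; higher means ranked earlier.
        return self.score(pooled).squeeze(-1)
```

In use, `hidden_states` would be the final-layer hidden states of the base model for each question paired with one proposed answer, with `hidden_size` read from the base model's configuration, and `scores.argsort(descending=True)` gives the ranking of the proposed answers.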
Out-of-Scope Use
This model should not be used for tasks outside the medical domain.
Bias, Risks, and Limitations
There is no guarantee that the model answers medical questions correctly. It should only be used for academic purposes, and not in clinical care.
Training Details
Training Data
The model is trained on a corpus of medical textbooks, and further pre-trained on generated questions: ECNQA_generated_questions.
Training Procedure
MQG is trained using next-token prediction on both datasets.
Training Hyperparameters
- Training regime: fp16 mixed-precision training.
Evaluation
Testing Data, Factors & Metrics
Testing Data
We tested the model on a medical question answering dataset, ECN-QA, based on the French medical residency examination. It is composed of "single" and "progressive" questions (i.e., a series of multiple related questions). It is a multiple-choice question dataset, containing 5 propositions for each question.
Metrics
We use the accuracy to evaluate the model on Medical Question Answering.
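For reference, accuracy in its generic form is the fraction of questions for which the predicted answer matches the reference. The function below is a plain illustration of that definition; the function name and the toy labels are made up, and the exact scoring protocol for ECN-QA (for example, how progressive questions are counted) is described in the paper, not here.

```python
def accuracy(predictions, labels):
    """Fraction of items whose prediction equals the reference label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example with made-up answer indices (3 of 4 predictions are correct).
predicted = [0, 3, 2, 4]
reference = [0, 1, 2, 4]
print(accuracy(predicted, reference))  # 0.75
```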
Results
See paper: https://arxiv.org/abs/2405.14654
Model Architecture and Objective
The model is based on BioMedLM's architecture, which is a modified version of the GPT-2 architecture.
Compute Infrastructure
Hardware
The model was trained on the Jean-Zay supercomputer, on multiple nodes with 4 A100 GPUs.
Software
PyTorch, DeepSpeed
Citation
BibTeX:
@article{khlaut2024efficient,
title={Efficient Medical Question Answering with Knowledge-Augmented Question Generation},
author={Khlaut, Julien and Dancette, Corentin and Ferreres, Elodie and Bennani, Alaedine and H{\'e}rent, Paul and Manceron, Pierre},
journal={Clinical NLP Workshop, NAACL 2024},
year={2024}
}
Model Card Contact
julien.khlaut at raidium.fr