MedVersa: A Generalist Learner for Multifaceted Medical Image Interpretation
This is the model card for our paper, A Generalist Learner for Multifaceted Medical Image Interpretation.
MedVersa is a compound medical AI system that can coordinate multimodal inputs, orchestrate models and tools for varying tasks, and generate multimodal outputs.
Environment
MedVersa is written in Python. We recommend managing your Python environment with conda, which requires installing Miniconda or Anaconda first.
After installing conda, create a new conda environment for MedVersa from the provided environment.yml:
conda env create -f environment.yml
conda activate medversa
The environment.yml above has been validated on NVIDIA A100 GPUs. If you have newer cards, e.g., NVIDIA H100 GPUs, you may need environment_cu118.yml, which supports CUDA 11.8:
conda env create -f environment_cu118.yml
conda activate medversa
If you run into an OpenCV issue, you may need to reinstall it using the opencv-contrib-python package:
pip install opencv-contrib-python
If you hit an incompatible torchvision version, try the following:
pip install torchvision==0.15.2+cu118 --index-url https://download.pytorch.org/whl/cu118
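After creating the environment, it can help to confirm which package builds were actually resolved. The following is a minimal sketch and not part of the MedVersa repository: the helper names report_versions and cuda_summary are hypothetical, and the package list is simply the set of packages mentioned above.

```python
from importlib import metadata


def report_versions(packages=("torch", "torchvision", "opencv-contrib-python")):
    """Return {package: installed version or None} for the given distributions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def cuda_summary():
    """Describe the CUDA build torch was compiled against, if torch is importable."""
    try:
        import torch
    except ImportError:
        return "torch is not importable; re-check the conda environment."
    return f"CUDA available: {torch.cuda.is_available()}, CUDA build: {torch.version.cuda}"


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
    print(cuda_summary())
```

If torchvision reports a version whose CUDA suffix differs from the torch build, the pip command above is the fix suggested by this model card.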
Inference
from utils import *
from torch import cuda
# --- Launch Model ---
device = 'cuda' if cuda.is_available() else 'cpu'
model_cls = registry.get_model_class('medomni') # medomni is the architecture name :)
model = model_cls.from_pretrained('hyzhou/MedVersa').to(device).eval()
# --- Define examples ---
examples = [
    [
        ["./demo_ex/c536f749-2326f755-6a65f28f-469affd2-26392ce9.png"],
        "Age:30-40.\nGender:F.\nIndication: ___-year-old female with end-stage renal disease not on dialysis presents with dyspnea. PICC line placement.\nComparison: None.",
        "How would you characterize the findings from <img0>?",
        "cxr",
        "report generation",
    ],
]
# --- Define hyperparams ---
num_beams = 1
do_sample = True
min_length = 1
top_p = 0.9
repetition_penalty = 1
length_penalty = 1
temperature = 0.1
# --- Generate a report for a chest X-ray image ---
index = 0
demo_ex = examples[index]
images, context, prompt, modality, task = demo_ex
seg_mask_2d, seg_mask_3d, output_text = generate_predictions(
    model, images, context, prompt, modality, task,
    num_beams, do_sample, min_length, top_p,
    repetition_penalty, length_penalty, temperature,
)
print(output_text)
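The hyperparameters above enable sampling (do_sample = True) with top_p = 0.9 and temperature = 0.1. How generate_predictions consumes these values is defined in the repository; the sketch below only illustrates the conventional meaning of temperature scaling followed by top-p (nucleus) filtering on a toy distribution. The function nucleus_probs and the toy logits are illustrative assumptions, not MedVersa code.

```python
import math


def nucleus_probs(logits, temperature=0.1, top_p=0.9):
    """Return renormalized probabilities after temperature scaling and top-p filtering."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of highest-probability tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Zero out everything outside the nucleus and renormalize the rest.
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


toy_logits = [2.0, 1.5, 0.5, -1.0]  # hypothetical scores for four tokens
print(nucleus_probs(toy_logits, temperature=0.1, top_p=0.9))
print(nucleus_probs(toy_logits, temperature=1.0, top_p=0.9))
```

On this toy input, a temperature of 0.1 concentrates essentially all probability on the top-scoring token, so sampling behaves almost greedily; at a temperature of 1.0, three of the four tokens survive the top-p cut.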
For more details and examples, please refer to inference.py.
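Each entry in examples is a five-element list: image paths, clinical context, prompt, modality, and task. A small helper can make that layout explicit and catch a prompt whose image placeholders do not match the image list. This is a sketch rather than repository code: make_example is a hypothetical name, the shortened context string is illustrative, and the rule that images are referenced as <img0>, <img1>, and so on in list order is an assumption inferred from the single example above.

```python
import re


def make_example(images, context, prompt, modality, task):
    """Build one entry in the [images, context, prompt, modality, task] layout."""
    # Assumption: each image is referenced in the prompt as <img0>, <img1>, ...
    referenced = {int(n) for n in re.findall(r"<img(\d+)>", prompt)}
    expected = set(range(len(images)))
    if referenced != expected:
        raise ValueError(
            f"prompt references images {sorted(referenced)}, "
            f"but {len(images)} image path(s) were given"
        )
    return [list(images), context, prompt, modality, task]


example = make_example(
    ["./demo_ex/c536f749-2326f755-6a65f28f-469affd2-26392ce9.png"],
    "Age:30-40.\nGender:F.\nComparison: None.",
    "How would you characterize the findings from <img0>?",
    "cxr",
    "report generation",
)
images, context, prompt, modality, task = example
print(modality, task)
```

The returned list can be appended to examples and unpacked in the same way as the demo entry.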
Demo
CUDA_VISIBLE_DEVICES=0 python demo.py --cfg-path medversa.yaml
Prompts
More prompts can be found in medomni/datasets/prompts.json.