mychen76 committed
Commit 55f20f4
1 Parent(s): fe72180

Update README.md

Files changed (1)
  1. README.md +43 -4
README.md CHANGED
@@ -49,12 +49,16 @@ from transformers import AutoProcessor
 
 device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
 dtype = torch.bfloat16
+```
 
-## input
+***input***
+```
 url = "https://huggingface.co/datasets/mychen76/medtrinity_brain_30k_hf/viewer/default/train?row=4&image-viewer=image-62-2B87111BBD996B48DB4C86B0244653FF84B3B8A9"
 image = Image.open(requests.get(url, stream=True).raw)
+```
 
-## load model
+***load model***
+```
 FINETUNED_MODEL_ID="mychen76/paligemma-3b-mix-448-med_30k-ct-brain"
 
 processor = AutoProcessor.from_pretrained(FINETUNED_MODEL_ID)
@@ -64,7 +68,7 @@ model = PaliGemmaForConditionalGeneration.from_pretrained(
     device_map=device
 ).eval()
 ```
-run inference
+***run inference***
 ```
 # Instruct the model to create a caption in Spanish
 def run_inference(input_text,input_image, model, processor,max_tokens=1024):
@@ -84,11 +88,46 @@ input_text="caption"
 pred_text = run_inference(input_text,input_image,model, processor)
 print(pred_text)
 ```
-result
+***result***
 ```
 The image is a CT scan of the brain, showing various brain structures without the presence of medical devices. The region of interest, located centrally and in the middle of the image, occupies approximately 3.0% of the area and appears to have an abnormal texture or density compared to the surrounding brain tissue, which may indicate a pathological condition. This abnormal area could be related to the surrounding brain structures, potentially affecting them or being affected by a shared pathological process, such as a hemorrhage or a mass effect.
 ```
 
+***Running on CUDA***
+
+```
+from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
+from PIL import Image
+import requests
+import torch
+
+FINETUNED_MODEL_ID="mychen76/paligemma-3b-mix-448-med_30k-ct-brain"
+device = "cuda:0"
+dtype = torch.bfloat16
+
+url = "https://huggingface.co/datasets/mychen76/medtrinity_brain_30k_hf/viewer/default/train?row=4&image-viewer=image-62-2B87111BBD996B48DB4C86B0244653FF84B3B8A9"
+image = Image.open(requests.get(url, stream=True).raw)
+
+model = PaliGemmaForConditionalGeneration.from_pretrained(
+    FINETUNED_MODEL_ID,
+    torch_dtype=dtype,
+    device_map=device,
+    revision="bfloat16",
+).eval()
+processor = AutoProcessor.from_pretrained(FINETUNED_MODEL_ID)
+
+# Instruct the model to create a caption in Spanish
+prompt = "caption es"
+model_inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
+input_len = model_inputs["input_ids"].shape[-1]
+
+with torch.inference_mode():
+    generation = model.generate(**model_inputs, max_new_tokens=100, do_sample=False)
+    generation = generation[0][input_len:]
+    decoded = processor.decode(generation, skip_special_tokens=True)
+    print(decoded)
+```
+
 
 ### Direct Use
 
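Note: the diff hunks above cut off the body of `run_inference` at the hunk boundary; only its signature is visible. A minimal sketch of what such a helper could look like, assuming the same processor/`generate` usage as the "Running on CUDA" block (the actual body in the README may differ):

```python
import torch

# Hypothetical reconstruction -- the real body lies outside the diff hunks.
def run_inference(input_text, input_image, model, processor, max_tokens=1024):
    # Build multimodal inputs and move them to the model's device
    model_inputs = processor(text=input_text, images=input_image, return_tensors="pt").to(model.device)
    input_len = model_inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        generation = model.generate(**model_inputs, max_new_tokens=max_tokens, do_sample=False)
        # Drop the prompt tokens and decode only the newly generated text
        generation = generation[0][input_len:]
        return processor.decode(generation, skip_special_tokens=True)
```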
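Also note that the `url` in the snippets points at the dataset viewer page rather than a raw image file, so `Image.open(requests.get(url, stream=True).raw)` may receive HTML instead of image bytes. A hedged alternative is to pull the sample through the `datasets` library; the split and the `"image"` column name below are assumptions, not taken from the README:

```python
from datasets import load_dataset

# Assumption: the dataset exposes a decoded "image" column in its "train" split.
ds = load_dataset("mychen76/medtrinity_brain_30k_hf", split="train")
image = ds[4]["image"]  # row=4, matching the row referenced in the viewer URL
```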