---
library_name: transformers
pipeline_tag: image-text-to-text
license: apache-2.0
datasets:
- joshuachou/SkinCAP
- HemanthKumarK/SKINgpt
language:
- en
tags:
- biology
- skin
- skin disease
- cancer
- medical
---
# Model Card for PaliGemma Dermatology Model
## Model Details
### Model Description
This model, based on the PaliGemma-3B architecture, has been fine-tuned for dermatology-related image and text processing tasks. The model is designed to assist in the identification of various skin conditions using a combination of image analysis and natural language processing.
- **Developed by:** Bruce_Wayne
- **Model type:** Vision-language model
- **Finetuned from model:** https://huggingface.co/google/paligemma-3b-pt-224
- **LoRA adapters used:** Yes
- **Intended use:** Medical image analysis, specifically for dermatology
Feedback on how the model works for you is welcome: https://forms.gle/cBA6apSevTyiEbp46

Thank you.
## Uses
### Direct Use
The model can be directly used for analyzing dermatology images, providing insights into potential skin conditions.
## Bias, Risks, and Limitations
- **Skin Tone Bias:** The model may have been trained on a dataset that does not adequately represent all skin tones, potentially leading to biased results.
- **Geographic Bias:** The model's performance may vary depending on the prevalence of certain conditions in different geographic regions.
## How to Get Started with the Model
```python
import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image
# Load the model and processor
model_id = "brucewayne0459/paligemma_derm"
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).to(device)
model.eval()

# Load a sample image and text input
input_text = "Identify the skin condition?"
input_image_path = "path/to/your/image.jpg"  # Replace with your actual image path
input_image = Image.open(input_image_path).convert("RGB")

# Process the input and move the tensors to the same device as the model
inputs = processor(text=input_text, images=input_image, return_tensors="pt", padding="longest").to(device)

# Set the maximum number of new tokens to generate
max_new_tokens = 50

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

# Decode the output
decoded_output = processor.decode(outputs[0], skip_special_tokens=True)
print("Model Output:", decoded_output)
```
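For decoder-style models such as PaliGemma, `generate` returns the prompt tokens followed by the newly generated tokens, so the decoded string above also contains the input question. The following is a minimal sketch of how the prompt could be sliced off before decoding. The helper name `strip_prompt` is hypothetical and not part of `transformers`; plain lists stand in for token ids so the example runs without downloading the model.

```python
def strip_prompt(generated_ids, prompt_len):
    """Return only the token ids produced after the prompt.

    `generated_ids` is a 1-D sequence (list or tensor) holding the prompt
    tokens followed by the generated tokens; `prompt_len` is the number of
    prompt tokens to drop from the front.
    """
    if prompt_len < 0 or prompt_len > len(generated_ids):
        raise ValueError("prompt_len must be between 0 and len(generated_ids)")
    return generated_ids[prompt_len:]


# With the snippet above, the helper would be applied roughly like this:
#   prompt_len = inputs["input_ids"].shape[-1]
#   answer_ids = strip_prompt(outputs[0], prompt_len)
#   print(processor.decode(answer_ids, skip_special_tokens=True))

# Toy demonstration: three "prompt" ids followed by three "generated" ids
toy_output = [101, 102, 103, 7, 8, 9]
print(strip_prompt(toy_output, 3))
```

Slicing by token count, rather than removing the prompt text from the decoded string, avoids problems when the tokenizer normalizes whitespace or punctuation in the prompt.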
## Training Details
### Training Data
The model was fine-tuned on dermatological images paired with disease names. The card metadata lists `joshuachou/SkinCAP` and `HemanthKumarK/SKINgpt` as the datasets.
### Training Procedure
The model was fine-tuned using LoRA (Low-Rank Adaptation) for more efficient training. Mixed precision (bfloat16) was used to speed up training and reduce memory usage.
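To illustrate why LoRA makes training cheaper: instead of updating a full weight matrix of shape `d_out x d_in`, LoRA freezes it and trains two small matrices of shapes `d_out x r` and `r x d_in`. The sketch below compares the parameter counts for a single layer. The layer size and rank are hypothetical values chosen for illustration; the card does not state the rank or target modules actually used.

```python
def full_params(d_in, d_out):
    """Number of weights in a dense d_out x d_in matrix."""
    return d_in * d_out


def lora_trainable_params(d_in, d_out, rank):
    """Trainable weights added by LoRA: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, not taken from this model's configuration
d_in, d_out, rank = 2048, 2048, 8

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)

print(f"Full fine-tuning: {full} trainable weights")
print(f"LoRA (rank {rank}): {lora} trainable weights")
print(f"Fraction trained: {lora / full:.2%}")
```

Under these assumed values, the adapter trains fewer than 1% of the weights of the layer it wraps, which is where the memory and speed savings come from.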
#### Training Hyperparameters
- **Training regime:** Mixed precision (bfloat16)
- **Epochs:** 10
- **Learning rate:** 2e-5
- **Batch size:** 6
- **Gradient accumulation steps:** 4
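With gradient accumulation, the optimizer updates the weights only after several batches have been processed, so the effective batch size is larger than the listed batch size. The sketch below works through that arithmetic, assuming the batch size of 6 is per device on the single GPU listed in this card. The dataset size is a hypothetical value, since the card does not report it.

```python
import math


def effective_batch_size(batch_size, grad_accum_steps, num_devices=1):
    """Samples contributing to each optimizer update."""
    return batch_size * grad_accum_steps * num_devices


def optimizer_steps_per_epoch(num_samples, batch_size, grad_accum_steps, num_devices=1):
    """Approximate number of weight updates needed to see every sample once."""
    per_update = effective_batch_size(batch_size, grad_accum_steps, num_devices)
    return math.ceil(num_samples / per_update)


# Values from the hyperparameter list above
batch_size = 6
grad_accum_steps = 4
epochs = 10

# Hypothetical dataset size, for illustration only
num_samples = 4000

print("Effective batch size:", effective_batch_size(batch_size, grad_accum_steps))
steps = optimizer_steps_per_epoch(num_samples, batch_size, grad_accum_steps)
print("Optimizer steps per epoch:", steps)
print("Optimizer steps in total:", steps * epochs)
```

The effective batch size of 24 follows directly from the listed hyperparameters; the step counts depend on the assumed dataset size and on how a given training framework handles the final partial batch.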
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
The model was evaluated on a separate validation set of dermatological images and disease names, distinct from the training data.
#### Metrics
- **Validation Loss:** The loss was tracked throughout the training process to evaluate model performance.
- **Accuracy:** The primary metric for assessing model predictions.
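The card does not describe how accuracy was computed. One plausible approach for a model that generates disease names is exact-match accuracy after light text normalization, sketched below. The function names and the sample predictions are hypothetical and are not results from this model.

```python
def normalize(name):
    """Lowercase a disease name and collapse surrounding or repeated whitespace."""
    return " ".join(name.lower().split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions that match their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)


# Made-up examples for illustration only
predictions = ["Psoriasis", "eczema ", "melanoma", "acne"]
references = ["psoriasis", "Eczema", "basal cell carcinoma", "acne"]

print("Exact-match accuracy:", exact_match_accuracy(predictions, references))
```

Exact match is strict: a prediction such as "atopic eczema" would count as wrong against the label "eczema", so a softer matching rule may be more appropriate depending on how the labels are written.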
### Results
The model reached a final validation loss of approximately 0.2214 on the validation set, which suggests a reasonable fit to the dataset used.
## Environmental Impact
- **Hardware Type:** 1 x NVIDIA L4 GPU
- **Hours used:** ~22 hours
- **Cloud Provider:** Lightning AI
- **Compute Region:** USA
- **Carbon Emitted:** 0.9 kg CO2 eq.
## Technical Specifications
### Model Architecture and Objective
- **Architecture:** Vision-Language model based on PaliGemma-3B
- **Objective:** To classify and diagnose dermatological conditions from images and text
### Compute Infrastructure
#### Hardware
- **GPU:** 1 x NVIDIA L4 GPU
## Model Card Authors
Bruce_Wayne