---
library_name: transformers
pipeline_tag: image-text-to-text
license: apache-2.0
datasets:
- joshuachou/SkinCAP
- HemanthKumarK/SKINgpt
language:
- en
tags:
- biology
- skin
- skin disease
- cancer
- medical
---

# Model Card for PaliGemma Dermatology Model

## Model Details

### Model Description

This model is a fine-tune of PaliGemma-3B for dermatology image-and-text tasks. It is designed to assist in identifying skin conditions from an image paired with a natural-language prompt.

- **Developed by:** Bruce_Wayne
- **Model type:** Vision-language model
- **Finetuned from model:** https://huggingface.co/google/paligemma-3b-pt-224
- **LoRA adapters used:** Yes
- **Intended use:** Medical image analysis, specifically for dermatology

Feedback on how the model performs is welcome: https://forms.gle/cBA6apSevTyiEbp46

Thank you.

## Uses

### Direct Use

The model can be used directly to analyze dermatology images and suggest potential skin conditions.

## Bias, Risks, and Limitations

**Skin Tone Bias:** The training data may not adequately represent all skin tones, which could lead to biased results.

**Geographic Bias:** The model's performance may vary with the prevalence of certain conditions in different geographic regions.

## How to Get Started with the Model

```python
import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image

# Load the model and processor, using the GPU when one is available
model_id = "brucewayne0459/paligemma_derm"
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).to(device)
model.eval()

# Load a sample image and text input
input_text = "Identify the skin condition?"
input_image_path = "path/to/your/image.jpg"  # Replace with your actual image path
input_image = Image.open(input_image_path).convert("RGB")

# Process the input and move the tensors to the same device as the model
inputs = processor(text=input_text, images=input_image, return_tensors="pt", padding="longest").to(device)

# Set the maximum number of new tokens to generate
max_new_tokens = 50

# Run inference
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

# Decode the output
decoded_output = processor.decode(outputs[0], skip_special_tokens=True)
print("Model Output:", decoded_output)
```
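
With decoder-style generation, the sequence returned by `generate` normally begins with the prompt tokens, so the decoded string may repeat the question before the answer. A minimal sketch of trimming the prompt, assuming the output is a flat token sequence and the prompt length is known (in the snippet above it would be `inputs["input_ids"].shape[-1]`). The helper name `strip_prompt_tokens` is hypothetical and not part of `transformers`:

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Return only the newly generated tokens from a generated sequence.

    output_ids: one generated sequence (a list or a 1-D tensor), prompt first.
    prompt_length: number of prompt tokens at the start of the sequence.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[prompt_length:]


# Toy example with made-up token ids: 3 prompt tokens, then 2 generated tokens
toy_output = [101, 102, 103, 7, 8]
new_tokens = strip_prompt_tokens(toy_output, prompt_length=3)
print(new_tokens)
```

With the real model, the trimmed sequence could then be passed to `processor.decode(..., skip_special_tokens=True)` in place of `outputs[0]`.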

## Training Details

### Training Data

The model was fine-tuned on dermatological images paired with disease names. The card's metadata lists the joshuachou/SkinCAP and HemanthKumarK/SKINgpt datasets.

### Training Procedure

The model was fine-tuned with LoRA (Low-Rank Adaptation), which trains a small set of adapter weights instead of the full model. Mixed precision (bfloat16) was used to speed up training and reduce memory usage.
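
To illustrate why LoRA makes fine-tuning cheaper, the sketch below compares the trainable parameters of one full weight matrix with those of its low-rank update. The layer size (2048 x 2048) and rank (8) are assumptions chosen for illustration; the card does not state the adapter rank or which layers were adapted.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters of a LoRA update B @ A for one weight matrix.

    B has shape (d_out, rank) and A has shape (rank, d_in); the original
    (d_out, d_in) weight stays frozen.
    """
    return rank * (d_in + d_out)


# Hypothetical layer size and rank (not taken from the model card)
d_in, d_out, rank = 2048, 2048, 8

full_params = d_in * d_out
lora_params = lora_trainable_params(d_in, d_out, rank)

print(f"Full matrix: {full_params:,} parameters")
print(f"LoRA update: {lora_params:,} parameters")
print(f"Trainable fraction: {lora_params / full_params:.4%}")
```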

#### Training Hyperparameters

- **Training regime:** Mixed precision (bfloat16)
- **Epochs:** 10
- **Learning rate:** 2e-5
- **Batch size:** 6
- **Gradient accumulation steps:** 4
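
With gradient accumulation, each optimizer update covers more samples than the per-step batch size. A small sketch of that arithmetic using the values listed above, assuming a single GPU (as reported under Hardware). The dataset size is a hypothetical number used only to show a rough step count:

```python
import math

per_device_batch_size = 6
gradient_accumulation_steps = 4
num_gpus = 1  # assumption: single-GPU training
epochs = 10

# Samples contributing to each optimizer update
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus

# Hypothetical dataset size, not taken from the model card
num_training_samples = 4800
steps_per_epoch = math.ceil(num_training_samples / effective_batch_size)
total_optimizer_steps = steps_per_epoch * epochs

print("Effective batch size:", effective_batch_size)
print("Optimizer steps per epoch:", steps_per_epoch)
print("Total optimizer steps:", total_optimizer_steps)
```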

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a held-out validation set of dermatological images and disease names, distinct from the training data.

#### Metrics

- **Validation Loss:** Tracked throughout training to monitor model performance.
- **Accuracy:** The primary metric for assessing model predictions.
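
The card does not say how accuracy was computed. One common choice for a model that outputs disease names is exact-match accuracy after light text normalization. The sketch below assumes that definition, and the function names and sample labels are illustrative only:

```python
def normalize(label):
    """Lowercase a label and collapse surrounding or repeated whitespace."""
    return " ".join(label.lower().split())


def exact_match_accuracy(predictions, references):
    """Fraction of predictions that equal their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    matches = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return matches / len(references)


# Made-up predictions and labels for illustration
predictions = ["Psoriasis", "eczema ", "melanoma", "acne"]
references = ["psoriasis", "Eczema", "basal cell carcinoma", "acne"]
accuracy = exact_match_accuracy(predictions, references)
print(accuracy)
```

Exact match is strict: a prediction that names a synonym or a more specific subtype of the reference condition counts as wrong, so a real evaluation might need a label mapping on top of this.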

### Results

The model reached a final validation loss of approximately 0.2214, which suggests a reasonable fit to the validation data for predicting skin conditions.

## Environmental Impact

- **Hardware Type:** 1 x L4 GPU
- **Hours used:** ~22 hours
- **Cloud Provider:** Lightning AI
- **Compute Region:** USA
- **Carbon Emitted:** 0.9 kg CO2 eq.
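
A rough way to sanity-check a figure like this is energy used multiplied by the grid's carbon intensity. The sketch below assumes the GPU draws its rated maximum power for the whole run (72 W for an L4, per NVIDIA's specification) and uses a placeholder carbon intensity. Neither assumption comes from the card, and the reported 0.9 kg may have been computed differently:

```python
def estimate_co2_kg(power_watts, hours, kg_co2_per_kwh):
    """Estimate emissions as energy (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


gpu_power_watts = 72       # assumption: L4 running at its rated maximum power
training_hours = 22        # from the card (approximate)
grid_kg_co2_per_kwh = 0.5  # placeholder carbon intensity, not from the card

estimate = estimate_co2_kg(gpu_power_watts, training_hours, grid_kg_co2_per_kwh)
print(f"Estimated emissions: {estimate:.2f} kg CO2 eq.")
```

This counts only the GPU; CPU, memory, and datacenter overhead would raise the estimate.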

## Technical Specifications

### Model Architecture and Objective

- **Architecture:** Vision-language model based on PaliGemma-3B
- **Objective:** To classify and diagnose dermatological conditions from images and text

### Compute Infrastructure

#### Hardware

- **GPU:** 1 x L4 GPU

## Model Card Authors

Bruce_Wayne