Image Segmentation
medical
biology
Jose committed on
Commit 13850eb
1 Parent(s): 0e2a52e

adding code and instructions

.gitignore ADDED
@@ -0,0 +1,6 @@
+ *.pyc
+ __pycache__
+ *.egg-info
+ *.zip
+ /samples/fundus/*
+ !/samples/fundus/original
README.md CHANGED
@@ -4,4 +4,26 @@ pipeline_tag: image-segmentation
  tags:
  - medical
  - biology
- ---
+ ---
+
+ ## VascX models
+
+ This repository contains the instructions for using the VascX models from the paper [VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images](https://arxiv.org/abs/2409.16016).
+
+ The model weights are on [Hugging Face](https://huggingface.co/Eyened/vascx).
+
+ ### Installation
+
+ To install the entire fundus analysis pipeline, including fundus preprocessing, model inference code, and vascular biomarker extraction:
+
+ 1. Create a conda or virtualenv virtual environment, or otherwise ensure a clean environment.
+
+ 2. Install the [rtnls_inference package](https://github.com/Eyened/retinalysis-inference).
+
+ ### Usage
+
+ To speed up re-execution of VascX, we recommend running the preprocessing and segmentation steps separately:
+
+ 1. Preprocessing. See [this notebook](./notebooks/0_preprocess.ipynb). This step is CPU-heavy and benefits from parallelization (see notebook).
+
+ 2. Inference. See [this notebook](./notebooks/1_segment_preprocessed.ipynb). All models can be run on a single GPU with >10 GB of VRAM.
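The installation steps above can be sketched as shell commands. This is a hedged sketch, not a prescribed procedure: the environment name and Python version are arbitrary choices, and installing the inference package straight from the linked git URL is an assumption (check that repository's README for the supported install method):

```shell
# Sketch of the installation steps; "vascx" and python=3.10 are
# arbitrary choices, not prescribed by this README.
conda create -n vascx python=3.10 -y
conda activate vascx

# Install the rtnls_inference package from the linked repository
# (assumes a pip-installable git URL).
pip install git+https://github.com/Eyened/retinalysis-inference.git
```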
notebooks/0_preprocess.ipynb ADDED
@@ -0,0 +1,124 @@
+ {
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from pathlib import Path\n",
+ "\n",
+ "import pandas as pd\n",
+ "\n",
+ "from rtnls_fundusprep.utils import preprocess_for_inference"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Preprocessing\n",
+ "\n",
+ "This code preprocesses the images and writes .png files containing the square fundus image and its contrast-enhanced version.\n",
+ "\n",
+ "This step is not strictly necessary, but it is useful if you want to run the preprocessing step separately before model inference.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Create a list of files to be preprocessed:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ds_path = Path(\"../samples/fundus\")\n",
+ "files = list((ds_path / \"original\").glob(\"*\"))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Images with a .dcm extension will be read as DICOM and their pixel_array interpreted as RGB. All other images will be read using PIL's Image.open."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "0it [00:00, ?it/s][Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.\n",
+ "6it [00:00, 143.58it/s]\n",
+ "[Parallel(n_jobs=4)]: Done 2 out of 6 | elapsed: 2.1s remaining: 4.2s\n",
+ "[Parallel(n_jobs=4)]: Done 3 out of 6 | elapsed: 2.1s remaining: 2.1s\n",
+ "[Parallel(n_jobs=4)]: Done 4 out of 6 | elapsed: 2.9s remaining: 1.4s\n",
+ "[Parallel(n_jobs=4)]: Done 6 out of 6 | elapsed: 4.3s finished\n"
+ ]
+ }
+ ],
+ "source": [
+ "bounds = preprocess_for_inference(\n",
+ " files, # list of image files\n",
+ " rgb_path=ds_path / \"rgb\", # output path for RGB images\n",
+ " ce_path=ds_path / \"ce\", # output path for contrast-enhanced images\n",
+ " n_jobs=4, # number of preprocessing workers\n",
+ ")\n",
+ "df_bounds = pd.DataFrame(bounds).set_index(\"id\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The preprocessor produces RGB and contrast-enhanced images cropped to a square and returns a dataframe with the image bounds, which can be used to reconstruct the original image. Output files are named after the input images, but with a .png extension. Be careful when providing multiple inputs that share a filename (ignoring extension), as this will result in overwritten images. Exceptions during preprocessing will not stop execution, but an error will be printed. Images that failed preprocessing for any reason are marked with `success=False` in the df_bounds dataframe."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "df_bounds.to_csv(ds_path / \"meta.csv\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "base",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.13"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+ }
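Since failed images are marked with `success=False` in the bounds dataframe, downstream steps can filter them out before inference. A minimal sketch with a toy dataframe (the `success` column and `id` index are what the notebook describes; the sample ids are illustrative):

```python
import pandas as pd

# Toy stand-in for df_bounds as described in the notebook:
# one row per image id, with a 'success' flag set by the preprocessor.
df_bounds = pd.DataFrame(
    {
        "id": ["CHASEDB1_08L", "DRIVE_22", "HRF_04_g"],
        "success": [True, False, True],
    }
).set_index("id")

# Keep only successfully preprocessed images for the inference step.
ok_ids = df_bounds[df_bounds["success"]].index.tolist()
failed_ids = df_bounds[~df_bounds["success"]].index.tolist()

print(ok_ids)      # ids to pass on to segmentation
print(failed_ids)  # ids to inspect or re-run
```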
notebooks/1_segment_preprocessed.ipynb ADDED
@@ -0,0 +1,217 @@
+ {
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from pathlib import Path\n",
+ "\n",
+ "import torch\n",
+ "\n",
+ "from rtnls_inference import (\n",
+ " HeatmapRegressionEnsemble,\n",
+ " SegmentationEnsemble,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Segmentation of preprocessed images\n",
+ "\n",
+ "Here we segment the images preprocessed with 0_preprocess.ipynb.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ds_path = Path(\"../samples/fundus\")\n",
+ "\n",
+ "# input folders: where the preprocessed images were stored\n",
+ "rgb_path = ds_path / \"rgb\"\n",
+ "ce_path = ds_path / \"ce\"\n",
+ "\n",
+ "# output folders:\n",
+ "av_path = ds_path / \"av\" # artery-vein segmentations\n",
+ "discs_path = ds_path / \"discs\" # optic disc segmentations\n",
+ "overlays_path = ds_path / \"overlays\" # optional overlay visualizations\n",
+ "\n",
+ "device = torch.device(\"cuda:0\") # device to use for inference"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "rgb_paths = sorted(rgb_path.glob(\"*.png\"))\n",
+ "ce_paths = sorted(ce_path.glob(\"*.png\"))\n",
+ "paired_paths = list(zip(rgb_paths, ce_paths))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "paired_paths[0] # important: check that the paths are paired correctly"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Artery-vein segmentation\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "av_ensemble = SegmentationEnsemble.from_huggingface('Eyened/vascx:artery_vein/av_july24.pt').to(device)\n",
+ "\n",
+ "av_ensemble.predict_preprocessed(paired_paths, dest_path=av_path, num_workers=2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Disc segmentation\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "disc_ensemble = SegmentationEnsemble.from_huggingface('Eyened/vascx:disc/disc_july24.pt').to(device)\n",
+ "disc_ensemble.predict_preprocessed(paired_paths, dest_path=discs_path, num_workers=2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Fovea detection\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "fovea_ensemble = HeatmapRegressionEnsemble.from_huggingface('Eyened/vascx:fovea/fovea_july24.pt').to(device)\n",
+ "# note: this model does not use contrast-enhanced images\n",
+ "df = fovea_ensemble.predict_preprocessed(paired_paths, num_workers=2)\n",
+ "df.columns = [\"mean_x\", \"mean_y\"]\n",
+ "df.to_csv(ds_path / \"fovea.csv\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "df"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Plotting the retinas (optional)\n",
+ "\n",
+ "This will only work if you ran all the models and stored the outputs using the same folder/file names as above.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from vascx.fundus.loader import RetinaLoader\n",
+ "\n",
+ "from rtnls_enface.utils.plotting import plot_gridfns\n",
+ "\n",
+ "loader = RetinaLoader.from_folder(ds_path)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "plot_gridfns([ret.plot for ret in loader[:6]])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Storing visualizations (optional)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "overlays_path.mkdir(exist_ok=True)\n",
+ "for ret in loader:\n",
+ " fig, _ = ret.plot()\n",
+ " fig.savefig(overlays_path / f\"{ret.id}.png\", bbox_inches=\"tight\", pad_inches=0)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "retinalysis",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.13"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+ }
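The notebook inspects `paired_paths[0]` by hand because pairing by sorted order silently misaligns if either folder is missing a file. A defensive alternative is to pair by filename stem; this is a pure-stdlib sketch (the `pair_by_stem` helper is ours, not part of the vascx packages), demonstrated on temporary folders standing in for the notebook's `rgb/` and `ce/` directories:

```python
from pathlib import Path
import tempfile

def pair_by_stem(rgb_dir: Path, ce_dir: Path):
    """Pair RGB and contrast-enhanced images by filename stem,
    raising if either folder is missing a counterpart."""
    rgb = {p.stem: p for p in rgb_dir.glob("*.png")}
    ce = {p.stem: p for p in ce_dir.glob("*.png")}
    unpaired = set(rgb) ^ set(ce)  # stems present in only one folder
    if unpaired:
        raise ValueError(f"unpaired images: {sorted(unpaired)}")
    return [(rgb[s], ce[s]) for s in sorted(rgb)]

# Toy demonstration with temporary folders standing in for rgb/ and ce/.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for sub in ("rgb", "ce"):
        (root / sub).mkdir()
        for name in ("DRIVE_22.png", "HRF_04_g.png"):
            (root / sub / name).touch()
    pairs = pair_by_stem(root / "rgb", root / "ce")
    stems = [r.stem for r, c in pairs]
```

The resulting list of (rgb, ce) tuples has the same shape as the notebook's `paired_paths`, so it could be passed to `predict_preprocessed` in the same way.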
samples/fundus/original/CHASEDB1_08L.png ADDED
samples/fundus/original/CHASEDB1_12R.png ADDED
samples/fundus/original/DRIVE_22.png ADDED
samples/fundus/original/DRIVE_40.png ADDED
samples/fundus/original/HRF_04_g.jpg ADDED

Git LFS Details

  • SHA256: fc9ed13ef42502eeecb3f1754dc0d3b72a454c82884b40dde934e8a516495588
  • Pointer size: 132 Bytes
  • Size of remote file: 1.9 MB
samples/fundus/original/HRF_07_dr.jpg ADDED

Git LFS Details

  • SHA256: 203ddec480816b6c9d7ea3c19c1ff0870a5a61b5b6c9a176300402ac47fbc10f
  • Pointer size: 131 Bytes
  • Size of remote file: 921 kB