---
datasets:
- links-ads/gaia-vineyard-uav-dataset
pipeline_tag: image-segmentation
tags:
- smart agriculture
- smart viticulture
- segformer
- transformer
- vineyard
- grapevine
license: mit
---

# GRowSeg
Grapevine Rows Segmentation (GRowSeg)

The paper will be published soon.

NEW: check out our [demo](https://huggingface.co/spaces/links-ads/gaia-growseg-demo)!

# Table of Contents
1. [Description and use cases](#description-and-use-cases)
2. [Model](#model)
3. [Input](#input)
4. [Preprocessing](#preprocessing)
5. [Output](#output)
6. [Postprocessing](#postprocessing)
7. [Dataset and training details](#dataset-and-training-details)
8. [How to run GRowSeg](#how-to-run-growseg)
9. [References](#references)
10. [Contributors](#contributors)
11. [License](#license)

## Description and use cases
GRowSeg is a deep learning model for segmenting grapevine rows in UAV-acquired RGB images of vineyards. It takes an RGB orthoimage as input and predicts a binary segmentation mask, i.e. an image with '1' for rows and '0' for background.

## Model
GRowSeg is a Segformer-b5 model. To allow comparison with the previous state of the art (https://arxiv.org/pdf/2108.01200), we repeat experiments T1 to T4 of the reference paper with GRowSeg. Results are reported in the following table as F1 scores, with the best score for each experiment in bold.

|Model         |T1   |T2   |T3   |T4   |
|--------------|:----|:----|:----|:----|
|SegNet        |0.73 |**0.85** |0.85 |0.76 |
|UNet          |0.75 |0.82 |**0.91** |0.75 |
|ModSegNet     |0.75 |0.83 |0.89 |0.76 |
|**GRowSeg**   |**0.78** |**0.85** |**0.91** |**0.78** |

### Input
The GRowSeg pipeline expects RGB input orthoimages in uint8 format. The range of supported ground sampling distance (GSD) values is approximately [0.75, 10] cm/px.
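
As a rough illustration of the GSD constraints, the sketch below checks a GSD against the approximate supported range and derives a value for the ``--scaling_factor`` option (listed under How to run GRowSeg) using the "GSD / 1.5" rule of thumb given in the caveats. The function names are hypothetical and not part of the GRowSeg code. Assuming ``scaling_factor`` multiplies the image size, the rescaled image would end up at roughly 1.5 cm/px.

```python
# Values quoted in this README; names below are illustrative only.
SUPPORTED_GSD_CM_PER_PX = (0.75, 10.0)  # approximate supported range
TARGET_GSD_CM_PER_PX = 1.5              # upper end of the optimal [1, 1.5] range


def is_supported_gsd(gsd_cm_per_px: float) -> bool:
    """Return True if the GSD lies in the approximate supported range."""
    low, high = SUPPORTED_GSD_CM_PER_PX
    return low <= gsd_cm_per_px <= high


def suggested_scaling_factor(gsd_cm_per_px: float,
                             target_gsd_cm_per_px: float = TARGET_GSD_CM_PER_PX) -> float:
    """Apply the 'GSD / 1.5' rule of thumb from the caveats."""
    if gsd_cm_per_px <= 0 or target_gsd_cm_per_px <= 0:
        raise ValueError("GSD values must be positive")
    return gsd_cm_per_px / target_gsd_cm_per_px
```

For example, an orthoimage at 3 cm/px would be passed ``--scaling_factor 2.0``, while one at 0.75 cm/px would use ``0.5``.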
### Preprocessing
The input image, of size HxWx3, is resized by the given ``scaling_factor``, minimally padded to be compatible with the given ``patch_size`` and ``stride``, scaled to [0, 1] and finally normalized with the ImageNet mean and standard deviation. A moving window then extracts image tiles of size ``patch_size``, stepping by ``stride`` pixels, and groups them into batches of ``batch_size`` tiles. Each batch therefore has shape ``int[batch_size, 3, patch_size, patch_size]``.

### Output
Given a batch, the model outputs a pixel-wise confidence score map for each tile: each value represents the confidence of assigning that pixel to the 'vine' (1) class. The output batch thus has shape ``int[batch_size, 1, patch_size, patch_size]``.

### Postprocessing
Tiles are merged back together, averaging the confidence scores where tiles overlap (i.e. if ``stride != patch_size``). The merged confidence score map is squeezed, unpadded and resized back to the original resolution HxW of the input orthoimage. The confidence score map is then thresholded at 0.5 to obtain a binary segmentation mask, which is saved to the specified output path. If the input image is a georeferenced TIFF, the saved mask is a georeferenced TIFF too.
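
The tiling and merging steps described above can be sketched with NumPy as follows. This is a minimal illustration, not the code in ``main.py``: the function names are hypothetical, resizing by ``scaling_factor`` is omitted, zero padding on the bottom and right edges is an assumption, and ``predict`` is a stand-in for the model that maps a batch of tiles to confidence scores in [0, 1].

```python
import numpy as np

# Standard ImageNet statistics for RGB channels.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def pad_amount(size, patch_size, stride):
    """Minimal padding so windows of patch_size stepping by stride end exactly at the edge."""
    if size <= patch_size:
        return patch_size - size
    return (-(size - patch_size)) % stride


def normalize(image):
    """uint8 HxWx3 -> float32 3xHxW, scaled to [0, 1], then ImageNet-normalized."""
    scaled = image.astype(np.float32) / 255.0
    return ((scaled - IMAGENET_MEAN) / IMAGENET_STD).transpose(2, 0, 1)


def tile_origins(height, width, patch_size, stride):
    """Top-left corners of all moving-window positions on a padded image."""
    ys = range(0, height - patch_size + 1, stride)
    xs = range(0, width - patch_size + 1, stride)
    return [(y, x) for y in ys for x in xs]


def merge_tiles(scores, origins, height, width):
    """Average per-tile score maps (N x P x P) into one height x width map."""
    patch_size = scores.shape[-1]
    total = np.zeros((height, width), dtype=np.float32)
    count = np.zeros((height, width), dtype=np.float32)
    for tile, (y, x) in zip(scores, origins):
        total[y:y + patch_size, x:x + patch_size] += tile
        count[y:y + patch_size, x:x + patch_size] += 1
    return total / count


def segment(image, predict, patch_size=512, stride=256, batch_size=16, threshold=0.5):
    """Pad, tile, predict in batches, merge, unpad and threshold."""
    if stride > patch_size:
        raise ValueError("this sketch assumes stride <= patch_size")
    h, w = image.shape[:2]
    pad_h = pad_amount(h, patch_size, stride)
    pad_w = pad_amount(w, patch_size, stride)
    # Zero padding on the bottom/right edges (an assumption of this sketch).
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)))
    tensor = normalize(padded)
    origins = tile_origins(padded.shape[0], padded.shape[1], patch_size, stride)
    tiles = np.stack([tensor[:, y:y + patch_size, x:x + patch_size] for y, x in origins])
    # predict maps (n, 3, P, P) -> (n, 1, P, P); drop the channel axis afterwards.
    scores = np.concatenate([predict(tiles[i:i + batch_size])[:, 0]
                             for i in range(0, len(tiles), batch_size)])
    merged = merge_tiles(scores, origins, padded.shape[0], padded.shape[1])
    return (merged[:h, :w] > threshold).astype(np.uint8)
```

With the default ``patch_size = 512`` and ``stride = 256``, interior pixels are covered by up to four tiles, so their final score is the mean of up to four predictions.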
## Dataset and training details
The datasets used for training and testing GRowSeg can be found at:
* [Group A orthoimages](https://github.com/Cybonic/DL_vineyard_segmentation_study) (request them from the owner of the repo)
* [Group B orthoimages](https://huggingface.co/datasets/links-ads/grapedrone-dataset/tree/main)

## How to run GRowSeg
* First, clone this repository:
```
git lfs install
git clone git@hf.co:links-ads/vitigeoss-growseg
```
* Then, create a Python virtual environment and install dependencies:
```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
* Finally, to run GRowSeg on an input image:
```
python main.py "/path/to/input_image.tif" "/path/to/output_mask.tif"
```

The output path for the mask must always be specified, and the output file must have the same extension as the input file. Supported image formats are .tif (preferred), .png, .jpg.

Several options can be specified:
* ``--patch_size``: the resolution of the tiles extracted by the moving window (default: 512)
* ``--stride``: the stride of the moving window (default: 256)
* ``--scaling_factor``: scaling factor for resizing the image (default: 1.0)
* ``--rotate``: perform inference also on 90°, 180°, 270°-rotated tiles, to enhance robustness at the cost of increased compute (default: False)
* ``--batch_size``: batch size for the inference (default: 16)
* ``--verbose``: tracks the inference with a progress bar (default: False)

Caveats:
* Since GRowSeg was trained with ``patch_size = 512``, it is suggested to leave it at the default.
* The optimal GSD range for GRowSeg is [1, 1.5] cm/px. Therefore, you may want to rescale your image based on its GSD, by setting ``scaling_factor`` e.g. to GSD / 1.5.
* GRowSeg automatically uses a GPU if one is available on your machine. In that case, you may want to increase ``batch_size`` to speed up inference.

## References
* **GRowSeg** presented in the [Official GRowSeg repo](https://github.com/links-ads/TODO)

## Contributors
* [tommonopolinks](https://github.com/tommonopolinks) (LINKS Foundation)
* [FedericOldani](https://github.com/FedericOldani) (LINKS Foundation)

## License
MIT License