SegFormer-Parker

This model is a fine-tuned version of nvidia/segformer-b5-finetuned-cityscapes-1024-1024 on a custom Parker dataset. It is designed for semantic segmentation with 4 classes.

Model description

SegFormer is a simple, efficient yet powerful semantic segmentation framework that unifies Transformers with lightweight MLP decoders. The model consists of a hierarchical Transformer encoder and a lightweight MLP decoder.

This checkpoint uses the B5 variant of SegFormer and was fine-tuned on a custom Parker dataset with the 4 classes listed under Class Mapping.

Intended uses & limitations

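You can use this model for semantic segmentation of images similar to those in the Parker dataset. The model expects RGB input at a resolution of 512x512 pixels, so resize other images before inference; accuracy on imagery that differs substantially from the training data is not characterized here.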

How to use

Here is how to use this model with the Transformers library:

from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor
import torch
import cv2
import numpy as np

# Load model and processor
model = SegformerForSemanticSegmentation.from_pretrained("simhaq-trmb/segformer-parker")
processor = SegformerImageProcessor.from_pretrained("simhaq-trmb/segformer-parker")

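# Load the image (OpenCV reads BGR), then convert to the 512x512 RGB input the model expects
image = cv2.imread("your_image.jpg")
if image is None:
    raise FileNotFoundError("Could not read your_image.jpg")
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image_resized = cv2.resize(image_rgb, (512, 512))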

# Process the image
inputs = processor(images=image_resized, return_tensors="pt")

# Perform inference
with torch.no_grad():
    outputs = model(pixel_values=inputs["pixel_values"])
    
    # Interpolate the logits to match the input resolution
    upsampled_logits = torch.nn.functional.interpolate(
        outputs.logits, 
        size=(512, 512),
        mode="bilinear",
        align_corners=False
    )
    
    # Get the segmentation mask
    segmentation_mask = upsampled_logits.argmax(dim=1).squeeze().cpu().numpy()
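
The mask produced above is 512x512, while the original image may have a different size. A minimal sketch of mapping the mask back to the original resolution follows; the helper name resize_mask_nearest is hypothetical and not part of this model card or of Transformers. It uses nearest-neighbor sampling because interpolating class IDs would produce values that do not correspond to any class.

```python
import numpy as np


def resize_mask_nearest(mask, height, width):
    """Resize an (H, W) array of class IDs to (height, width) by nearest-neighbor sampling."""
    mask = np.asarray(mask)
    # For every output row/column, pick the source index it falls into.
    rows = np.arange(height) * mask.shape[0] // height
    cols = np.arange(width) * mask.shape[1] // width
    return mask[rows[:, None], cols]
```

For example, resize_mask_nearest(segmentation_mask, image.shape[0], image.shape[1]) would return a mask aligned with the image loaded earlier. An alternative that may give smoother boundaries is to pass the original height and width as size when interpolating the logits, before taking the argmax.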

Training procedure

The model was trained with the following parameters:

Optimizer: AdamW with learning rate 5e-5
Loss function: Cross Entropy Loss
Batch size: 8
Number of epochs: 50
Best model was saved based on the lowest validation loss
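
The original training script is not included here. The following is only a rough sketch of a loop matching the listed settings (AdamW at 5e-5, cross-entropy loss, 50 epochs, keeping the weights with the lowest validation loss). The function name fine_tune and its arguments are hypothetical, and it assumes that model(images) returns logits of shape (N, C, H, W) at label resolution; the Transformers SegFormer model returns an output object with lower-resolution logits, so it would need a small wrapper that extracts and upsamples them. The batch size of 8 would be set on the data loaders.

```python
import copy

import torch
from torch import nn


def fine_tune(model, train_loader, val_loader, epochs=50, lr=5e-5, device="cpu"):
    """Train with AdamW and cross-entropy; restore the weights with the lowest validation loss.

    Assumes each loader yields (images, labels) with integer labels of shape (N, H, W).
    """
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    best_loss, best_state = float("inf"), None

    for _ in range(epochs):
        # Training pass
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

        # Validation pass: mean loss over batches
        model.eval()
        total, batches = 0.0, 0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                total += criterion(model(images), labels).item()
                batches += 1
        val_loss = total / max(batches, 1)

        # Keep a copy of the best weights seen so far
        if val_loss < best_loss:
            best_loss = val_loss
            best_state = copy.deepcopy(model.state_dict())

    if best_state is not None:
        model.load_state_dict(best_state)
    return best_loss
```

Selecting on validation loss rather than on the final epoch guards against keeping an overfitted checkpoint when training runs for a fixed number of epochs.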

Class Mapping

The model predicts 4 classes with the following mapping:

Class ID	Color	Description
0	Blue	Sky
1	Green	Dense Foliage
2	Black	Obstruction
3	Yellow	Sparse Foliage
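
As an illustration of how this mapping might be applied to a predicted mask, the sketch below colorizes the mask and reports per-class coverage. The class IDs and descriptions come from the table; the exact RGB triplets are assumed values for the named colors, and the names PALETTE, colorize, and class_fractions are hypothetical.

```python
import numpy as np

# IDs and descriptions follow the table; RGB values are assumptions for the named colors.
CLASS_NAMES = ["Sky", "Dense Foliage", "Obstruction", "Sparse Foliage"]
PALETTE = np.array(
    [
        [0, 0, 255],    # 0: Sky (blue)
        [0, 255, 0],    # 1: Dense Foliage (green)
        [0, 0, 0],      # 2: Obstruction (black)
        [255, 255, 0],  # 3: Sparse Foliage (yellow)
    ],
    dtype=np.uint8,
)


def colorize(mask):
    """Map an (H, W) array of class IDs to an (H, W, 3) RGB image."""
    mask = np.asarray(mask, dtype=np.intp)
    return PALETTE[mask]


def class_fractions(mask):
    """Return the fraction of pixels assigned to each class, keyed by description."""
    mask = np.asarray(mask, dtype=np.intp)
    counts = np.bincount(mask.ravel(), minlength=len(CLASS_NAMES))
    return {name: float(count) / mask.size for name, count in zip(CLASS_NAMES, counts)}
```

The colorized array is in RGB order, so it would need a cv2.cvtColor(..., cv2.COLOR_RGB2BGR) conversion before being written with cv2.imwrite.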

Model size: 84.6M parameters (Safetensors, tensor type F32)
