Model Card for UNet_USC_TIMIT

This U-Net model classifies each pixel of a real-time MRI (rtMRI) video as either air or tissue; the Air-Tissue Boundaries are then obtained from the resulting masks.

Model Description

The model uses a U-Net architecture with three decoder branches. The encoder consists of convolutional and downsampling layers, followed by a bottleneck layer. All three decoder branches share this encoder and bottleneck, but each has its own upsampling and convolutional layers. Each decoder branch produces a mask for a different class, and the final output is a 4D tensor with shape (batch_size, height, width, n_labels).
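To make the output shape concrete, the sketch below shows how three per-branch masks could be combined into a single tensor of shape (batch_size, height, width, n_labels). It is only an illustration in NumPy, not the model code: the helper `stack_branch_masks`, the random stand-in data, and the frame size are assumptions chosen for the example.

```python
import numpy as np

def stack_branch_masks(branch_outputs):
    """Stack per-branch masks of shape (batch, height, width) along a new last axis."""
    return np.stack(branch_outputs, axis=-1)

# Illustrative sizes only; the real values depend on the data and training setup.
batch_size, height, width, n_labels = 2, 68, 68, 3

rng = np.random.default_rng(0)
# Random stand-ins for the per-pixel probabilities produced by each decoder branch.
branch_outputs = [rng.random((batch_size, height, width)) for _ in range(n_labels)]

output = stack_branch_masks(branch_outputs)
print(output.shape)  # one channel per decoder branch in the last axis
```

Indexing the last axis, for example `output[..., 0]`, recovers the mask of a single branch.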

Developed by: Vinayaka Hegde, during an internship at the Signal Processing Interpretation and Representation (SPIRE) Lab, Indian Institute of Science, Bengaluru
Model type: U-Net
Language(s) (NLP): N/A
License: Apache 2.0
Finetuned from model: N/A

Model Sources

Repository: vinster619/UNet_USC_TIMIT

Uses

This pre-trained U-Net model was trained on videos 342 and 391 from each speaker in the 10-speaker USC-TIMIT Corpus (20 videos in total). It is designed to classify each pixel in an rtMRI video as either air or tissue. Three distinct masks were used as training targets, one for each decoder branch.

Direct Use

The model produces three binary segmentation masks, and their corresponding contours, for any rtMRI video within the USC-TIMIT Corpus.
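As a rough illustration of how a contour can be read off a predicted mask, the sketch below thresholds per-pixel probabilities and marks tissue pixels that touch air. It is not taken from inference.py: the helper names, the 0.5 threshold, the convention that 1 means tissue, and the 4-neighbour rule are all assumptions made for this example.

```python
import numpy as np

def binarize(prob_mask, threshold=0.5):
    """Turn per-pixel probabilities into a binary mask (assumed: 1 = tissue, 0 = air)."""
    return (prob_mask >= threshold).astype(np.uint8)

def boundary_pixels(mask):
    """Mark tissue pixels that have at least one air pixel among their 4 neighbours."""
    # Edge padding repeats border values, so the image border alone is not a boundary.
    padded = np.pad(mask, 1, mode="edge")
    up = padded[:-2, 1:-1]
    down = padded[2:, 1:-1]
    left = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    has_air_neighbour = (up == 0) | (down == 0) | (left == 0) | (right == 0)
    return ((mask == 1) & has_air_neighbour).astype(np.uint8)

# Toy 5x5 "frame": a 3x3 block of tissue surrounded by air.
prob = np.zeros((5, 5))
prob[1:4, 1:4] = 0.9

mask = binarize(prob)
contour = boundary_pixels(mask)
print(contour)
```

In the toy frame, only the centre pixel of the tissue block is left out of the contour, because it is the only tissue pixel with no air neighbour.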

Downstream Use

This model can be adapted to subjects from other rtMRI datasets by fine-tuning it on approximately 10-15 frames of each new subject to be segmented.
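This card does not say how the 10-15 fine-tuning frames should be chosen. One simple option, shown here purely as an assumption, is to spread them evenly across a video of the new subject so that they cover a range of vocal tract shapes. The helper `pick_finetune_frames` and the frame count of 500 are hypothetical.

```python
import numpy as np

def pick_finetune_frames(n_frames, n_selected=12):
    """Return sorted, unique frame indices spread evenly across a video."""
    n_selected = min(n_selected, n_frames)
    indices = np.linspace(0, n_frames - 1, num=n_selected)
    return np.unique(np.round(indices).astype(int))

# Hypothetical video length; replace with the real number of frames.
frame_ids = pick_finetune_frames(n_frames=500, n_selected=12)
print(frame_ids)
```

The selected frames would then need manually annotated masks before they can be used for fine-tuning.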

Out-of-Scope Use

The model is expected to segment accurately only videos from the USC-TIMIT Corpus. To segment videos of subjects from other rtMRI datasets accurately, fine-tuning on frames from the new subject is required.

How to Get Started with the Model

Run inference.py to load the weights uploaded to this repository and produce an output video file with the segmented Air-Tissue boundaries.

Training Details

Data: USC-TIMIT Corpus (https://sail.usc.edu/span/usc-timit/)
Training set size: 2 videos per subject, for each of the 10 subjects in the dataset
Validation set size: 1 video per subject, for each of the 10 subjects in the dataset
Model Architecture: U-Net with a shared encoder and three decoder branches
Optimizer: Adam
Loss Function: Binary Crossentropy
Epochs: 30 (EarlyStopping used)
Batch Size: 8
Evaluation Metrics: Pixel Classification Accuracy, Dice Coefficient
Validation Split: not specified (determined by the split between train_matrix and val_matrix)
Hardware: NVIDIA GeForce RTX 4060 Laptop GPU
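For readers unfamiliar with the loss and metrics listed above, the following NumPy sketch shows their standard definitions on a toy mask. It is not the training code: the function names, the smoothing constant `eps`, and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return float(np.mean(y_true == y_pred))

def dice_coefficient(y_true, y_pred, eps=1e-7):
    """Dice = 2 * |A and B| / (|A| + |B|) for binary masks; eps avoids division by zero."""
    intersection = np.sum(y_true * y_pred)
    return float((2.0 * intersection + eps) / (np.sum(y_true) + np.sum(y_pred) + eps))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    p = np.clip(y_prob, eps, 1 - eps)  # keep log() finite
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Toy ground-truth and predicted binary masks.
y_true = np.array([[1, 1, 0, 0], [1, 0, 0, 0]])
y_pred = np.array([[1, 0, 0, 0], [1, 0, 0, 1]])
# Toy probabilities: 0.9 wherever the label is 1, 0.1 elsewhere.
y_prob = np.where(y_true == 1, 0.9, 0.1)

accuracy = pixel_accuracy(y_true, y_pred)
dice = dice_coefficient(y_true, y_pred)
bce = binary_crossentropy(y_true, y_prob)
print(accuracy, dice, bce)
```

Dice is usually reported alongside pixel accuracy because accuracy alone can look high when one class dominates the frame.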

Model Card Authors

Vinayaka Hegde

Model Card Contact

vinayakahegde619@gmail.com