KameliaZaman/Butterfly-Classification-Using-CNN

Butterfly Classification using CNN

Butterfly image classification using ResNet50V2
View Demo

Table of Contents

About The Project
- Built With
Getting Started
- Dependencies
- Installation
Usage
Contributing
License
Contact

About The Project

This project builds a butterfly image classifier on the ResNet50V2 architecture. The goal is to identify butterfly species accurately from images. The model is trained on a labeled dataset of butterfly images and fine-tuned to maximize classification accuracy. The resulting tool is intended to help researchers, conservationists, and enthusiasts identify and catalog butterfly species, in support of biodiversity studies and conservation efforts.


Built With


Getting Started

Follow these steps to set up the project locally.

Dependencies

The following libraries and packages must be installed to run this project:

TensorFlow 2.16.1
```
conda install -c conda-forge tensorflow
```
OpenCV 4.9.0
```
conda install -c conda-forge opencv
```
Gradio 4.24.0
```
conda install -c conda-forge gradio
```
NumPy 1.26.4
```
conda install -c conda-forge numpy
```

Alternative: Export Environment

Alternatively, export the complete conda environment to a file:

conda env export > environment.yml

Recreate it on another machine using:

conda env create -f environment.yml

Installation

# clone project
git clone https://huggingface.co/spaces/KameliaZaman/Butterfly-Classification-using-CNN

# go inside the project directory 
cd Butterfly-Classification-using-CNN

# install the required packages
pip install -r requirements.txt

# run the gradio app
python app.py


Usage

Dataset

The dataset comes from https://www.kaggle.com/datasets/gpiosenka/butterfly-images40-species and contains training, test, and validation sets for 100 butterfly and moth species.

Model Architecture

The model is built on ResNet50V2 and trained with the Adam optimizer at a learning rate of 0.0001.
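
A minimal sketch of how such a model might be assembled in Keras follows. Only the ResNet50V2 backbone, the Adam optimizer, and the 0.0001 learning rate come from this README; the pooling layer, the classification head, the loss, the 224x224 input size, and the `build_resnet_model` name are assumptions for illustration. `weights=None` is used here to avoid a download; passing `weights="imagenet"` would start from pretrained weights instead.

```python
import tensorflow as tf

NUM_CLASSES = 100  # the dataset covers 100 butterfly and moth species


def build_resnet_model(num_classes=NUM_CLASSES, input_shape=(224, 224, 3), weights=None):
    """Hypothetical helper: ResNet50V2 backbone plus a small softmax head."""
    backbone = tf.keras.applications.ResNet50V2(
        include_top=False,   # drop the 1000-class ImageNet classifier
        weights=weights,     # None = random init; "imagenet" = pretrained
        input_shape=input_shape,
        pooling="avg",       # global average pooling -> one feature vector per image
    )
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(backbone.output)
    model = tf.keras.Model(inputs=backbone.input, outputs=outputs)

    # Sparse loss matches integer labels (class_mode='sparse' in the generators).
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Whether the backbone was frozen or fine-tuned end to end is not stated here, so the sketch leaves all layers trainable.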

Data Preparation

The dataset is loaded from a CSV file containing information about the butterflies and moths.
Image paths are constructed based on the dataset information.
The dataset is split into training, validation, and test sets.
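
The steps above can be sketched with the standard library alone. The column names (`filepaths`, `labels`, `data set`), the split values, and the `load_splits` helper are assumptions about the CSV layout, not confirmed by this README; adjust them to match the actual file.

```python
import csv
import io
import os
from collections import defaultdict


def load_splits(csv_file, image_root):
    """Group (image_path, label) pairs by the split named in each CSV row."""
    splits = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Build the full image path from the dataset root and the relative path.
        path = os.path.join(image_root, row["filepaths"])
        splits[row["data set"]].append((path, row["labels"]))
    return dict(splits)


# Tiny in-memory stand-in for the real CSV (hypothetical rows).
sample_csv = io.StringIO(
    "filepaths,labels,data set\n"
    "train/ADONIS/001.jpg,ADONIS,train\n"
    "valid/ADONIS/1.jpg,ADONIS,valid\n"
    "test/ATALA/1.jpg,ATALA,test\n"
)
splits = load_splits(sample_csv, "data")
print({name: len(rows) for name, rows in splits.items()})
```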

Exploratory Data Analysis (EDA)

Visualizations are created to explore the distribution of labels in the dataset.

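import plotly.express as px

# Count the ten most frequent labels (df is the DataFrame loaded from the dataset CSV)
label_counts = df['labels'].value_counts()[:10]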

fig = px.bar(x=label_counts.index, 
             y=label_counts.values,
             color=label_counts.values,
             text=label_counts.values,
             color_continuous_scale='Blues')

fig.update_layout(
    title_text='Top 10 Labels Distribution',
    template='plotly_white',
    xaxis=dict(
        title='Label',
    ),
    yaxis=dict(
        title='Count',
    )
)

fig.update_traces(marker_line_color='black', 
                  marker_line_width=1.5, 
                  opacity=0.8)
 
fig.show()

Image Data Generation

Image data generators are used to augment the training data.

Training and validation data generators are created.

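from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augment training images with random flips; only rescale validation images
train_gen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True, rescale=1/255.)
val_gen = ImageDataGenerator(rescale=1/255.)

BATCH_SIZE = 64
SEED = 56
# Note: the deployment code resizes to (224, 224); the two sizes should match
IMAGE_SIZE = (244, 244)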

train_flow_gen = train_gen.flow_from_directory(directory=train_dir,
                                              class_mode='sparse',
                                              batch_size=BATCH_SIZE,
                                              target_size=IMAGE_SIZE,
                                              seed=SEED)

val_flow_gen = val_gen.flow_from_directory(directory=val_dir,
                                            class_mode='sparse',
                                            batch_size=BATCH_SIZE,
                                            target_size=IMAGE_SIZE,
                                            seed=SEED)

Model Training

The ResNet50V2-based model is constructed and compiled.
The model is trained on the augmented training data, and its performance is monitored using validation data.

Callbacks for reducing learning rate and early stopping are employed during training.

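# train_df / val_df hold the rows of each split; rlr_cb and early_cb are the
# learning-rate-reduction and early-stopping callbacks
resnet_model.fit(train_flow_gen,
                 epochs=15,
                 steps_per_epoch=int(np.ceil(train_df.shape[0] / BATCH_SIZE)),
                 validation_data=val_flow_gen,
                 validation_steps=int(np.ceil(val_df.shape[0] / BATCH_SIZE)),
                 callbacks=[rlr_cb, early_cb])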
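
The `rlr_cb` and `early_cb` callbacks are not defined in this README. One plausible setup, using the standard Keras `ReduceLROnPlateau` and `EarlyStopping` callbacks, is sketched below; the monitored metric, factor, and patience values are illustrative assumptions, as is the `steps_for` helper, which mirrors the ceiling division used for `steps_per_epoch`.

```python
import math

import tensorflow as tf

# Halve the learning rate when validation loss stops improving (assumed settings).
rlr_cb = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6
)

# Stop training after several stagnant epochs and keep the best weights (assumed settings).
early_cb = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)


def steps_for(num_samples, batch_size):
    """Batches needed to see every sample once; the last batch may be partial."""
    return math.ceil(num_samples / batch_size)


print(steps_for(130, 64))  # 130 samples in batches of 64 -> 3 steps
```

Rounding up rather than down ensures the final, partially filled batch is still included in each epoch.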

Model Evaluation

The trained model is evaluated on the test set to measure its accuracy.
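
Accuracy here is the fraction of test images whose highest-probability class matches the true label. A minimal NumPy sketch of that computation follows, using made-up probabilities rather than real model output; the `top1_accuracy` name is hypothetical.

```python
import numpy as np


def top1_accuracy(probabilities, true_labels):
    """Share of rows whose argmax equals the integer label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == np.asarray(true_labels)))


# Three fake predictions over three classes; the last one is wrong.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.6, 0.3, 0.1]])
labels = [0, 1, 2]
print(top1_accuracy(probs, labels))
```

With a Keras model compiled with `metrics=['accuracy']`, calling `model.evaluate` on a test generator should report the same metric.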

Deployment

Gradio is utilized for deploying the trained model.

Users can input an image, and the model will predict the butterfly species.

import gradio as gr
import tensorflow as tf
from tensorflow.keras.models import load_model
import numpy as np
import cv2

model_path = './model_checkpoint_manual_resnet.h5'
model = load_model(model_path)

class_names = ['ADONIS', 'AFRICAN GIANT SWALLOWTAIL', 'AMERICAN SNOOT', 'AN 88', 'APPOLLO', 'ARCIGERA FLOWER MOTH', 'ATALA', 'ATLAS MOTH', 'BANDED ORANGE HELICONIAN', 'BANDED PEACOCK']

def preprocess_image(img):
    if isinstance(img, str):
        # Load from disk; OpenCV reads BGR, so convert to RGB to match training
        img = cv2.imread(img)
        if img is None:
            raise ValueError("Could not read the image file.")
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    elif not isinstance(img, np.ndarray):
        raise ValueError("Unsupported input type. Please provide a file path or a NumPy array.")
    img = cv2.resize(img, (224, 224))
    img = img / 255.0  # Normalize pixel values to [0, 1]
    img = np.expand_dims(img, axis=0)  # Add batch dimension
    return img

def classify_image(img):
    img = preprocess_image(img)
    predictions = model.predict(img)
    predicted_class = np.argmax(predictions)
    predicted_class_name = class_names[predicted_class]
    
    return f"Predicted Class: {predicted_class_name}"

iface = gr.Interface(fn=classify_image, 
                     inputs="image",
                     outputs="text",
                     live=True)

iface.launch()
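
`np.argmax` returns an index, so `class_names` has to follow the same index order the training generator used. Keras directory iterators expose that mapping as `class_indices` (class name to index); the sketch below inverts such a mapping into an ordered list. The dictionary shown is a made-up three-class example, and `names_in_index_order` is a hypothetical helper.

```python
def names_in_index_order(class_indices):
    """Turn {'NAME': index} into a list where list[index] == 'NAME'."""
    return [name for name, _ in sorted(class_indices.items(), key=lambda item: item[1])]


# Stand-in for train_flow_gen.class_indices (hypothetical values).
example_indices = {"ATALA": 2, "ADONIS": 0, "AN 88": 1}
class_names_example = names_in_index_order(example_indices)
print(class_names_example)  # ['ADONIS', 'AN 88', 'ATALA']
```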


Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

1. Fork the Project
2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
3. Commit your Changes (git commit -m 'Add some AmazingFeature')
4. Push to the Branch (git push origin feature/AmazingFeature)
5. Open a Pull Request


License

Distributed under the MIT License. See MIT License for more information.


Contact

Kamelia Zaman Moon - kamelia.stu2017@juniv.edu

Project Link: https://huggingface.co/spaces/KameliaZaman/Butterfly-Classification-using-CNN
