CNN with CIFAR-10
A PyTorch implementation of a Convolutional Neural Network (CNN) for image classification on the CIFAR-10 dataset, achieving 81.45% test accuracy.
Try it out! https://huggingface.co/spaces/Rahuletto/CNN
Architecture
The CNN model consists of:
Convolutional Layers:
- Conv1: 3 → 32 channels, 3x3 kernel, padding=1
- Conv2: 32 → 64 channels, 3x3 kernel, padding=1
- Conv3: 64 → 128 channels, 3x3 kernel, padding=1
Other components:
- Batch Normalization after each convolutional layer
- Max pooling (MaxPool2d, 2x2) for downsampling
- ReLU activation functions
- Fully Connected Layers: 2048 → 512 → 10
- Dropout (50%) for regularization
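The 2048 input features of the first fully connected layer follow from the layer list above. Here is a minimal sketch of that arithmetic in plain Python, assuming a 2x2 max pool (stride 2) follows each of the three convolutions and that inputs are 32x32 CIFAR-10 images. The helper `conv_out` is a hypothetical name introduced for illustration, not part of the project.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32      # CIFAR-10 images are 32x32
channels = 3   # RGB input
trace = [(channels, size)]

for out_channels in (32, 64, 128):
    # 3x3 convolution with padding=1 keeps the spatial size unchanged
    size = conv_out(size, kernel=3, stride=1, padding=1)
    # 2x2 max pooling with stride 2 halves the spatial size
    size = conv_out(size, kernel=2, stride=2)
    channels = out_channels
    trace.append((channels, size))

flattened = channels * size * size

print(trace)      # [(3, 32), (32, 16), (64, 8), (128, 4)]
print(flattened)  # 2048
```

The final feature map is 128 channels of 4x4, which flattens to the 2048 values fed into the first fully connected layer.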
Getting Started
Prerequisites
- Python 3.12+
- PyTorch 2.7.1+
- torchvision 0.22.1+
This project was developed with uv, so it is best to use uv for project management.
Installation
Clone the repository:
git clone https://github.com/rahuletto/cnn
cd cnn
Create virtual environment:
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
Install dependencies:
pip install -r requirements.txt
Running the model
Run the main file:
python main.py
You can then play around in the Gradio interface:
https://github.com/user-attachments/assets/1f742c32-79bd-4d16-a74f-68c241f4a841
Model Code
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Each 3x3 convolution (padding=1) keeps the spatial size;
        # the 2x2 max pool applied after it in forward() halves it.
        self.conv1 = nn.Conv2d(3, 32, 3, stride=1, padding=1)    # after pooling: 32x32 -> 16x16
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 3, stride=1, padding=1)   # after pooling: 16x16 -> 8x8
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, 3, stride=1, padding=1)  # after pooling: 8x8 -> 4x4
        self.bn3 = nn.BatchNorm2d(128)
        self.pool = nn.MaxPool2d(stride=2, kernel_size=2)
        self.fc1 = nn.Linear(128 * 4 * 4, 512)
        self.fc2 = nn.Linear(512, 10)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        # Three conv -> batch norm -> ReLU -> pool blocks
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        x = x.view(x.size(0), -1)  # flatten to (batch, 2048)
        x = self.dropout(x)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)  # raw class scores (logits) for the 10 classes
        return x
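To get a feel for the size of this model, the number of trainable parameters can be worked out by hand from the layer definitions. The sketch below does this in plain Python, assuming PyTorch's defaults (a bias term on every `Conv2d` and `Linear` layer, and a learnable scale and shift per channel in `BatchNorm2d`). The helper function names are hypothetical and exist only for this illustration.

```python
def conv2d_params(in_ch, out_ch, kernel):
    # kernel*kernel weights per input channel, plus one bias, per output channel
    return out_ch * (in_ch * kernel * kernel + 1)


def batchnorm_params(channels):
    # learnable scale (weight) and shift (bias) per channel;
    # running mean/variance are buffers, not trainable parameters
    return 2 * channels


def linear_params(in_features, out_features):
    # one weight per input feature, plus one bias, per output feature
    return out_features * (in_features + 1)


layers = {
    "conv1": conv2d_params(3, 32, 3),
    "bn1": batchnorm_params(32),
    "conv2": conv2d_params(32, 64, 3),
    "bn2": batchnorm_params(64),
    "conv3": conv2d_params(64, 128, 3),
    "bn3": batchnorm_params(128),
    "fc1": linear_params(128 * 4 * 4, 512),
    "fc2": linear_params(512, 10),
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name:>5}: {count:>9,}")
print(f"total: {total:>9,}")  # total: 1,147,914
```

Under these assumptions, `fc1` alone accounts for 1,049,088 of the roughly 1.15 million parameters, which is consistent with dropout being applied around the fully connected layers.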
Training Configuration
- Optimizer: Adam (lr=0.001)
- Batch Size: 64
- Epochs: 50
The best model checkpoint was saved at epoch 49, with a validation loss of 0.6553.
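Keeping the "best" checkpoint usually means saving the weights only when the validation loss improves on the lowest value seen so far. The sketch below shows that selection rule in plain Python. It is an assumption about how such a checkpoint could be chosen, not the project's actual training code: the `BestCheckpoint` class is hypothetical, and the loss values are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BestCheckpoint:
    """Tracks the epoch with the lowest validation loss seen so far."""
    epoch: int = -1
    val_loss: float = float("inf")

    def update(self, epoch: int, val_loss: float) -> bool:
        """Record this epoch if it improves the best loss; return True if it did."""
        if val_loss < self.val_loss:
            self.epoch = epoch
            self.val_loss = val_loss
            return True
        return False


# Hypothetical validation losses, one per epoch (not real results)
history = [0.92, 0.81, 0.85, 0.74, 0.78]

best = BestCheckpoint()
saved_epochs = []
for epoch, val_loss in enumerate(history, start=1):
    if best.update(epoch, val_loss):
        # In a real training loop this is where the weights would be written,
        # e.g. torch.save(model.state_dict(), "model.pt")
        saved_epochs.append(epoch)

print(saved_epochs)               # [1, 2, 4]
print(best.epoch, best.val_loss)  # 4 0.74
```

With this rule, a checkpoint saved at epoch 49 of 50 simply means the validation loss was still reaching new lows near the end of training.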
Model
There are two CNN models in the cnn/ folder:
- model.pt: trained with BatchNorm2d, reaching 81.45% accuracy on the CIFAR-10 dataset
- model-old.pt: trained without fine-tuning, reaching 75% accuracy on the CIFAR-10 dataset
Accuracy:
Total Accuracy: 81.45%
- Airplane: 84.60%
- Automobile: 93.20%
- Bird: 76.90%
- Cat: 69.70%
- Deer: 77.20%
- Dog: 64.00%
- Frog: 89.30%
- Horse: 82.10%
- Ship: 89.60%
- Truck: 87.90%
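The total accuracy and the per-class figures can be cross-checked against each other. The CIFAR-10 test set contains 1,000 images per class, so the overall accuracy should equal the plain average of the ten per-class accuracies. A short sketch of that check, assuming the figures above were measured on the standard test set:

```python
# Per-class test accuracies (percent), copied from the list above
per_class = {
    "Airplane": 84.60,
    "Automobile": 93.20,
    "Bird": 76.90,
    "Cat": 69.70,
    "Deer": 77.20,
    "Dog": 64.00,
    "Frog": 89.30,
    "Horse": 82.10,
    "Ship": 89.60,
    "Truck": 87.90,
}

# With equally sized classes, overall accuracy is the unweighted mean
mean_accuracy = sum(per_class.values()) / len(per_class)

hardest = min(per_class, key=per_class.get)
easiest = max(per_class, key=per_class.get)

print(f"{mean_accuracy:.2f}%")  # 81.45%
print(hardest, easiest)         # Dog Automobile
```

The weakest classes (Dog, Cat) are animals that are commonly confused with each other on this dataset, while vehicles score highest.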
References
- CIFAR-10 Dataset
- PyTorch Documentation
- Convolutional Neural Networks for Visual Recognition (CS231n)
- Deep Learning Book - Ian Goodfellow
License
This project is licensed under the MIT License - see the LICENSE file for details.