---
license: apache-2.0
datasets:
- fashion_mnist
language:
- en
metrics:
- accuracy
pipeline_tag: image-classification
---

# Fashion-MNIST Baseline Classifier

## Model Details

- **Model Name:** fashion-mnist-base
- **Framework:** Custom implementation in Python
- **Version:** 0.1
- **License:** Apache-2.0

## Model Description

This is a neural network model developed from the ground up to classify images from the Fashion-MNIST dataset. The dataset comprises 70,000 grayscale images across 10 categories. Each example is a 28x28 grayscale image, associated with a label from 10 classes including T-shirts/tops, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots.

## Intended Use

This model is intended for educational purposes and as a baseline for more complex implementations. It can be used by students and AI enthusiasts to understand the workings of neural networks and their application in image classification.

## Training Data

The model was trained on the Fashion-MNIST dataset, which contains 60,000 training images and 10,000 test images. Each image is 28x28 pixels, grayscale, associated with one of 10 classes representing different types of clothing and accessories.

### Architecture Details:

- Input layer: 784 neurons (flattened 28x28 image)
- Hidden layer 1: 256 neurons, ReLU activation, Dropout
- Hidden layer 2: 64 neurons, ReLU activation, Dropout
- Output layer: 10 neurons, logits

### Hyperparameters:

- Learning rate: 0.005
- Batch size: 32
- Epochs: 25

The model uses a self-implemented stochastic gradient descent (SGD) optimizer.
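
As a rough size check, the layer widths and hyperparameters above can be turned into parameter and update counts. This is a minimal sketch under two assumptions that this card does not state: every layer is fully connected with one bias per output neuron, and all 60,000 training images are used in every epoch (no validation split). The helper name `count_parameters` is illustrative.

```python
# Layer widths taken from the architecture list above.
layer_sizes = [784, 256, 64, 10]


def count_parameters(sizes):
    """Weights plus biases for a stack of dense layers (assumed layout)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


total_params = count_parameters(layer_sizes)

# Hyperparameters from the list above; 60,000 is the Fashion-MNIST training set size.
train_size, batch_size, epochs = 60_000, 32, 25

# Ceiling division, so a final partial batch would still count as one step.
steps_per_epoch = -(-train_size // batch_size)
total_steps = steps_per_epoch * epochs

print(f"parameters: {total_params}")
print(f"SGD steps per epoch: {steps_per_epoch}, total: {total_steps}")
```

Under these assumptions the network has about 218 thousand trainable parameters and performs 1,875 SGD updates per epoch.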

## Evaluation Results

The model achieved the following performance on the test set:

- Accuracy: 86.7%
- Precision, Recall, and F1-Score:

| Label       | Precision | Recall | F1-score |
|-------------|-----------|--------|----------|
| T-shirt/Top | 0.847514  | 0.767  | 0.805249 |
| Trouser     | 0.982618  | 0.961  | 0.971689 |
| Pullover    | 0.800000  | 0.748  | 0.773127 |
| Dress       | 0.861868  | 0.886  | 0.873767 |
| Coat        | 0.776278  | 0.805  | 0.790378 |
| Sandal      | 0.957958  | 0.957  | 0.957479 |
| Shirt       | 0.638587  | 0.705  | 0.670152 |
| Sneaker     | 0.935743  | 0.932  | 0.933868 |
| Bag         | 0.952381  | 0.960  | 0.956175 |
| Ankle-Boot  | 0.944554  | 0.954  | 0.949254 |

## Limitations and Biases

Due to the nature of the training dataset, the model may not capture the full complexity of fashion items in diverse real-world scenarios. In practice, we found that it is sensitive to background color and to the proportions of the article in the image.

## How to Use

```python
import torch
import torchvision.transforms as transforms
from PIL import Image

# Load the full pickled model object.
# On PyTorch 2.6 and later, torch.load defaults to weights_only=True,
# so pass weights_only=False here if loading fails.
model = torch.load('fashion-mnist-base.pt')

# Images need to be transformed to the `fashion MNIST` dataset format
transform = transforms.Compose(
    [
        transforms.Resize((28, 28)),
        transforms.Grayscale(),
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,)),  # Normalization
        transforms.Lambda(lambda x: 1.0 - x),  # Invert colors
        transforms.Lambda(lambda x: x[0]),  # Keep the single channel: (28, 28)
        transforms.Lambda(lambda x: x.unsqueeze(0)),  # Add a leading dimension: (1, 28, 28)
    ]
)

img = Image.open('fashion/dress.png')
img = transform(img)

# Print the per-class scores returned by the model
print(model.predictions(img))
```

## Sample Output

```
{'Dress': 84.437744, 'Coat': 7.631796, 'Pullover': 4.2272186,
 'Shirt': 1.297625, 'T-shirt/Top': 1.2237197, 'Bag': 0.9053432,
 'Trouser/Jeans': 0.27268794, 'Sneaker': 0.0031491981,
 'Ankle-Boot': 0.00063403655, 'Sandal': 8.5103806e-05}
```
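
The scores in the sample output sum to 100, which is consistent with a softmax over the 10 output logits scaled to percentages. How `model.predictions` computes them is not documented here, so the following is only a sketch of one plausible post-processing step. The helper `logits_to_percentages` and the example logits are hypothetical; the label names are copied from the sample output and placed in the standard Fashion-MNIST class order.

```python
import math

# Label names as printed in the sample output, in Fashion-MNIST index order (0-9).
LABELS = [
    "T-shirt/Top", "Trouser/Jeans", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle-Boot",
]


def logits_to_percentages(logits):
    """Hypothetical helper: softmax over logits, scaled to percent, sorted descending."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    scores = {label: 100.0 * e / total for label, e in zip(LABELS, exps)}
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


# Made-up logits for illustration only; they were not produced by the model.
example = logits_to_percentages([1.0, -0.5, 2.2, 5.2, 2.8, -8.5, 1.1, -5.0, 0.7, -6.5])
print(example)
```

Sorting by score puts the most likely class first, which matches the ordering of the dictionary shown in the sample output.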