Model Card for Model ID

This model is trained on the Food101 dataset and is designed to classify images of food into 101 different categories. It uses a ResNet-50 architecture and achieves high accuracy, making it suitable for image classification tasks in the food domain.

Model Details

Model Description

This model is a ResNet-50-based deep learning model, trained on the Food101 dataset. It classifies food images into 101 categories, such as Apple Pie, Cheesecake, Pizza, and others, using a convolutional neural network architecture that has been fine-tuned for optimal performance.

  • Developed by: Vinny
  • Model type: Image Classification (CNN - ResNet-50)
  • License: MIT
  • Finetuned from model: ResNet-50 pretrained on ImageNet

Uses

Direct Use

The model can be used for food image classification. Given an input image, it predicts one of 101 categories of food.

Out-of-Scope Use

This model is trained specifically for food images and may not perform well on images outside of this domain. It should not be used for classifying non-food-related content. Malicious or unintended use such as misuse for surveillance should also be avoided.

Bias, Risks, and Limitations

Since the model is trained on the Food101 dataset, its performance is limited to the classes within that dataset. If the food categories are skewed (e.g., over-representation of certain cuisines), this could affect performance on underrepresented categories.

Risks

  • Class Imbalance: Some food categories in the training data may have more examples than others, which could lead to biased predictions for overrepresented classes.
  • Misclassifications: The model might misclassify food items that look similar to each other or if the image quality is poor.
  • Domain-specific Limitations: This model is not generalized to recognize objects outside of food items.

Recommendations

When using this model, users should ensure that the input images are of good quality and that the food items fall within the 101 categories of the Food101 dataset.

Further fine-tuning or retraining on more diverse datasets may be required if you want to improve classification for underrepresented classes.

How to Get Started with the Model

Use the code below to get started with the model.

Use the following command to install the required dependencies:

pip install torch torchvision

Example Code

Below is an example of how to load and use the model for food image classification:

import torch
import torchvision.models as models
from torchvision import transforms
from PIL import Image
import matplotlib.pyplot as plt

# Load the model
model = models.resnet50(pretrained=False, num_classes=101)
model.load_state_dict(torch.load('food101_model.pth', map_location=torch.device('cpu')))
model.eval()

# Define preprocessing transformations
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load and preprocess the image
img = Image.open('your_image.jpg')  # replace with the path to your image
img_tensor = transform(img).unsqueeze(0)

# Make a prediction
with torch.no_grad():
    output = model(img_tensor)
    _, predicted = torch.max(output, 1)

# Map the predicted index to the food class label
food_classes = [
    'apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare', 'beet_salad', 'beignets',
    'bibimbap', 'bread_pudding', 'breakfast_burrito', 'bruschetta', 'caesar_salad', 'cannoli', 'caprese_salad',
    'carrot_cake', 'ceviche', 'cheesecake', 'cheese_plate', 'chicken_curry', 'chicken_quesadilla', 'chicken_wings',
    'chocolate_cake', 'chocolate_mousse', 'churros', 'clam_chowder', 'club_sandwich', 'crab_cakes', 'creme_brulee',
    'croque_madame', 'cup_cakes', 'deviled_eggs', 'donuts', 'dumplings', 'edamame', 'eggs_benedict', 'escargots',
    'falafel', 'filet_mignon', 'fish_and_chips', 'foie_gras', 'french_fries', 'french_onion_soup', 'french_toast',
    'fried_calamari', 'fried_rice', 'frozen_yogurt', 'garlic_bread', 'gnocchi', 'greek_salad', 'grilled_cheese_sandwich',
    'grilled_salmon', 'guacamole', 'gyoza', 'hamburger', 'hot_and_sour_soup', 'hot_dog', 'huevos_rancheros', 'hummus',
    'ice_cream', 'lasagna', 'lobster_bisque', 'lobster_roll_sandwich', 'macaroni_and_cheese', 'macarons', 'miso_soup',
    'mussels', 'nachos', 'omelette', 'onion_rings', 'oysters', 'pad_thai', 'paella', 'pancakes', 'panna_cotta',
    'peking_duck', 'pho', 'pizza', 'pork_chop', 'poutine', 'prime_rib', 'pulled_pork_sandwich', 'ramen', 'ravioli',
    'red_velvet_cake', 'risotto', 'samosa', 'sashimi', 'scallops', 'seaweed_salad', 'shrimp_and_grits', 'spaghetti_bolognese',
    'spaghetti_carbonara', 'spring_rolls', 'steak', 'strawberry_shortcake', 'sushi', 'tacos', 'takoyaki', 'tiramisu',
    'tuna_tartare', 'waffles'
]  # 101 food categories
predicted_class = food_classes[predicted.item()]

# Show the image and the prediction
plt.imshow(img)
plt.title(f"Predicted: {predicted_class}")
plt.axis('off')
plt.show()

Training Details

Training Data

The model was trained on the Food101 dataset, which contains 101,000 images across 101 food categories. Each category contains 1,000 images of various food types.

Dataset: Food101 Dataset

Training Procedure

The model was trained using a ResNet-50 architecture. The following preprocessing steps were applied to the images:

Preprocessing [optional]

Resize images to 224x224 pixels.

Normalize pixel values to match the ImageNet statistics.

Convert images to tensors.

Training Hyperparameters

  • Epochs: 10
  • Batch size: 32
  • Learning rate: 0.001
  • Optimizer: Adam

Evaluation

Testing Data, Factors & Metrics

Metrics

The model's performance was evaluated based on accuracy, with a final achieved accuracy of 95.03% after 10 epochs.

Results

Final Accuracy: 95.03%

The model shows excellent performance in classifying food images across 101 categories.

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

BibTeX:

@misc{food101model,
  author = {VinnyVortex},
  title = {Food101 Image Classification Model},
  year = {2025},
  publisher = {Hugging Face}
}

Glossary

ResNet-50: A convolutional neural network architecture that uses residual connections.

Food101: A dataset containing 101 food categories with 1,000 images each.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Dataset used to train VinnyVortex004/Food101-Classifier