---
license: apache-2.0
datasets:
- bharat-raghunathan/indian-foods-dataset
metrics:
- accuracy
- precision
- recall
---
# Indian Food Classification with Vision Transformer (ViT)
## Overview
This model is a Vision Transformer (ViT) fine-tuned to classify images of Indian dishes. It was trained on the [Indian Foods Dataset](https://huggingface.co/datasets/bharat-raghunathan/indian-foods-dataset), hosted on the Hugging Face Hub.
## Dataset
The Indian Foods Dataset contains 4,770 images across 15 different classes of popular Indian dishes. The dataset is split into:
- Training: 3,047 images
- Validation: 762 images
- Testing: 961 images
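
As a quick sanity check on the split sizes above, the following sketch confirms that they add up to the stated total and shows the share of each split (roughly 64% / 16% / 20%). The dictionary keys are illustrative names, not necessarily the split names used in the dataset repository.

```python
# Split sizes as stated in this card; the key names are illustrative.
SPLITS = {"train": 3047, "validation": 762, "test": 961}

total = sum(SPLITS.values())

# Fraction of the full dataset that falls into each split.
fractions = {name: count / total for name, count in SPLITS.items()}

for name, frac in fractions.items():
    print(f"{name:<10} {SPLITS[name]:>5} images  {frac:.1%}")
print(f"{'total':<10} {total:>5} images")
```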
## Model
The base model is `google/vit-base-patch16-224-in21k`, a ViT-Base checkpoint with 16×16 patches and 224×224 input resolution, pre-trained on ImageNet-21k. It was fine-tuned on the Indian Foods Dataset for 10 epochs using the AdamW optimizer with a learning rate of 2e-4.
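
A minimal sketch of how this configuration could be expressed with the `transformers` library follows. Only the checkpoint, epoch count, learning rate, optimizer, and number of classes come from this card. The use of `Trainer`-style `TrainingArguments`, the batch size, and the output directory are assumptions made for illustration, and the actual training script may differ.

```python
# Hyperparameters stated in this card.
HPARAMS = {
    "base_checkpoint": "google/vit-base-patch16-224-in21k",
    "num_labels": 15,
    "num_train_epochs": 10,
    "learning_rate": 2e-4,
}


def build_model_and_args(output_dir="vit-indian-food-out", batch_size=32):
    """Build a ViT classifier and training arguments.

    `output_dir` and `batch_size` are hypothetical defaults; the card
    does not state them.
    """
    # Imported lazily so HPARAMS can be inspected without transformers installed.
    from transformers import TrainingArguments, ViTForImageClassification

    # Loads the pre-trained encoder and attaches a fresh 15-way classification head.
    model = ViTForImageClassification.from_pretrained(
        HPARAMS["base_checkpoint"],
        num_labels=HPARAMS["num_labels"],
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=HPARAMS["num_train_epochs"],
        learning_rate=HPARAMS["learning_rate"],
        per_device_train_batch_size=batch_size,  # assumption
        optim="adamw_torch",  # AdamW, as stated in the card
        remove_unused_columns=False,  # keep the image column for the collator
    )
    return model, args
```

The returned `model` and `args` can then be passed to a `Trainer` together with preprocessed training and validation splits.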
## Evaluation
On the held-out test set (961 images), the model achieved the following metrics:
- Accuracy: 0.9667
- Precision: 0.9670
- Recall: 0.9667
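
The card does not state how precision and recall were averaged across the 15 classes. The reported recall equals the reported accuracy, which is consistent with support-weighted averaging, since weighted recall always reduces to accuracy. The sketch below illustrates that relationship in plain Python; the labels are toy data invented for this example, not the model's predictions.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall)."""
    n = len(y_true)
    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    # Per-class precision, weighted by each class's share of the true labels.
    precision = sum(
        support[c] / n * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    # Per-class recall with the same weights; algebraically equal to accuracy.
    recall = sum(support[c] / n * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall


# Toy labels, invented for illustration only.
y_true = ["class_a", "class_a", "class_b", "class_b", "class_b", "class_c"]
y_pred = ["class_a", "class_b", "class_b", "class_b", "class_b", "class_c"]

acc, prec, rec = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```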
## Usage
You can load this fine-tuned model directly from the Hugging Face Hub with the `transformers` library.
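
A minimal inference sketch using the `transformers` image-classification pipeline is shown here. The repository id `therealcyberlord/vit-indian-food` is an assumption inferred from the model page's repository name and author, and the image path in the usage note is hypothetical; adjust both to your setup.

```python
# Assumed repository id; verify it against the model page before use.
MODEL_ID = "therealcyberlord/vit-indian-food"


def load_classifier(model_id=MODEL_ID):
    """Create an image-classification pipeline for the given model."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("image-classification", model=model_id)


def top_prediction(predictions):
    """Pick the highest-scoring entry from the pipeline output.

    The pipeline returns a list of {"label": str, "score": float} dicts.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify_image(image, model_id=MODEL_ID):
    """Classify one image (file path, URL, or PIL image) and return (label, score)."""
    classifier = load_classifier(model_id)
    return top_prediction(classifier(image))
```

For example, `classify_image("path/to/dish.jpg")` returns the predicted dish name and its confidence score. The first call downloads the model weights, so it requires network access.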