therealcyberlord/vit-indian-food

	---
	license: apache-2.0
	datasets:
	- bharat-raghunathan/indian-foods-dataset
	metrics:
	- accuracy
	- precision
	- recall
	---

	# Indian Food Classification with Vision Transformer (ViT)

	## Overview
	This model is a Vision Transformer (ViT) fine-tuned to classify images of Indian dishes. It was trained on the [Indian Foods Dataset](https://huggingface.co/datasets/bharat-raghunathan/indian-foods-dataset) from Hugging Face Datasets.

	## Dataset
	The Indian Foods Dataset contains 4,770 images across 15 classes of popular Indian dishes. The dataset is split into:

	- Training: 3,047 images
	- Validation: 762 images
	- Testing: 961 images
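
As a quick sanity check, the split sizes above can be summed and expressed as fractions of the full dataset. This is a minimal sketch; the `splits` dictionary only restates the counts listed above, and its name is chosen here for illustration.

```python
# Split sizes as listed in this model card.
splits = {"train": 3047, "validation": 762, "test": 961}

total = sum(splits.values())  # should match the 4,770 images stated above
fractions = {name: count / total for name, count in splits.items()}

for name, frac in fractions.items():
    print(f"{name:<10} {splits[name]:>5} images  {frac:.1%}")
```

The counts add up to 4,770 and correspond to roughly 64% training, 16% validation, and 20% testing.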

	## Model
	The base model is the Vision Transformer checkpoint `google/vit-base-patch16-224-in21k`. It was fine-tuned on the Indian Foods Dataset for 10 epochs using the AdamW optimizer with a learning rate of 2e-4.
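
The setup described above could be reproduced roughly as follows. This is a sketch under stated assumptions, not the author's training script: `FineTuneConfig` and `build_model_and_optimizer` are hypothetical names, and anything the card does not state (batch size, weight decay, learning-rate schedule, augmentation) is left out or at library defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values stated in this model card.
    base_model: str = "google/vit-base-patch16-224-in21k"
    num_labels: int = 15
    epochs: int = 10
    learning_rate: float = 2e-4


def build_model_and_optimizer(cfg: FineTuneConfig):
    """Load the base checkpoint with a new 15-way head and create AdamW."""
    # Imported lazily so the config can be inspected without torch installed.
    import torch
    from transformers import ViTForImageClassification

    model = ViTForImageClassification.from_pretrained(
        cfg.base_model, num_labels=cfg.num_labels
    )
    # Other AdamW settings (e.g. weight decay) are not given in the card,
    # so the PyTorch defaults are used here as an assumption.
    optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate)
    return model, optimizer


cfg = FineTuneConfig()
```

Calling `build_model_and_optimizer(cfg)` downloads the base checkpoint and returns a model whose classification head is newly initialized for the 15 classes.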

	## Evaluation
	The model was evaluated on the test set and achieved the following metrics:

	- Accuracy: 0.9667
	- Precision: 0.9670
	- Recall: 0.9667
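
The card does not say how precision and recall were averaged over the 15 classes. One common choice is support-weighted averaging, under which recall always equals accuracy; that would be consistent with the identical 0.9667 values above, but it is an assumption. The sketch below computes the three metrics that way on made-up labels; `weighted_metrics` is a hypothetical helper, and the toy data is not from the model's test set.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall)."""
    n = len(y_true)
    support = Counter(y_true)    # number of true examples per class
    predicted = Counter(y_pred)  # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(true_pos.values()) / n
    # Per-class scores are weighted by each class's share of the true labels.
    precision = sum(
        (support[c] / n) * (true_pos[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    recall = sum((support[c] / n) * (true_pos[c] / support[c]) for c in support)
    return accuracy, precision, recall


# Toy labels for three classes, made up for illustration only.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
accuracy, precision, recall = weighted_metrics(y_true, y_pred)
```

On this toy data, accuracy and weighted recall coincide while weighted precision differs, the same pattern as in the reported numbers.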

	## Usage
	You can load this fine-tuned model directly from the Hugging Face Hub.
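
A minimal inference sketch using the `transformers` image-classification pipeline is shown below. It assumes the repository ships the preprocessing configuration the pipeline needs; `classify` is a hypothetical helper name and the image path in the comment is a placeholder.

```python
MODEL_ID = "therealcyberlord/vit-indian-food"  # repository id of this model card


def classify(image, top_k=3):
    """Return the top_k predictions as a list of {"label", "score"} dicts.

    `image` may be a local file path, a URL, or a PIL image.
    """
    # Imported lazily; requires the transformers, torch, and pillow packages.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    return classifier(image, top_k=top_k)


# Example call (placeholder path, replace with your own image):
# for prediction in classify("path/to/image.jpg"):
#     print(f'{prediction["label"]}: {prediction["score"]:.3f}')
```

The first call downloads the model weights from the Hub; for repeated predictions, it is cheaper to build the pipeline once and reuse it.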