license: apache-2.0

Fine-tuning a Vision Transformer on the Big Cats Dataset

In this project, we fine-tuned a Vision Transformer (ViT) on the Big Cats dataset to perform image classification. The dataset consists of 2,339 images spanning 10 classes of big cats, including lions, tigers, and jaguars.

Our goal was to train a model that classifies these images accurately. After fine-tuning a pre-trained Vision Transformer [1], the model reached an accuracy of 98%.
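As a rough illustration of the setup described above, the sketch below shows one way a pre-trained ViT could be given a 10-class head and how accuracy could be computed. It is not the training code behind this model: the checkpoint name `google/vit-base-patch16-224-in21k` and the helper names `build_label_maps`, `accuracy`, and `build_model` are assumptions made for the example, and the class list must be supplied by the caller because this README names only three of the ten classes.

```python
from typing import Dict, Sequence, Tuple


def build_label_maps(class_names: Sequence[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Map class indices to names and back, in the form a model config expects."""
    id2label = {i: name for i, name in enumerate(class_names)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


def build_model(class_names: Sequence[str],
                checkpoint: str = "google/vit-base-patch16-224-in21k"):
    """Load a pre-trained ViT and attach a fresh classification head.

    The default checkpoint is an assumption; this README does not state
    which pre-trained weights were used.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import ViTForImageClassification

    id2label, label2id = build_label_maps(class_names)
    return ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(class_names),  # size of the new classification head
        id2label=id2label,
        label2id=label2id,
    )
```

Calling `build_model` with the ten class names downloads the pre-trained weights, so it needs network access. The training loop itself is omitted here; `accuracy` would be applied to the predicted and true class indices of a held-out set.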

References

[1] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.