Spaces: chicelli/img2art-search

brunorosilva committed on Jul 2, 2024

Commit 3fcf4ec, 1 parent: 3343b23

docs: update readme

Files changed (2)

MIT-LICENSE.txt +20 -0
README.md +39 -16

MIT-LICENSE.txt ADDED

	@@ -0,0 +1,20 @@

+Copyright (c) 2024 Bruno Chicelli
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

README.md CHANGED

@@ -1,4 +1,4 @@
-# (WIP) MakeItSports Bot Image-to-Art Search
 This project fine-tunes a Vision Transformer (ViT) model, pre-trained with "google/vit-base-patch32-224-in21k" weights and fine tuned with the style of [ArtButMakeItSports](https://www.instagram.com/artbutmakeitsports/), to perform image-to-art search across 81k artworks made available by [WikiArt](https://wikiart.org/).
@@ -6,7 +6,7 @@ This project fine-tunes a Vision Transformer (ViT) model, pre-trained with "goog
 - [Overview](#overview)
 - [Installation](#installation)
-- [Usage](#usage)
 - [Dataset](#dataset)
 - [Training](#training)
 - [Inference](#inference)
@@ -44,30 +44,35 @@ This project leverages the Vision Transformer (ViT) model architecture for the t
 ### Training
-1. Fine-tune the ViT model:
-    ```sh
-    poetry run python main.py train --epochs 50 --batch_size 32
-    ```
 ### Inference via Gradio
-1. Perform image-to-art search using the fine-tuned model:
-    ```sh
-    poetry run python main.py interface
-    ```
 ### Create new gallery
-1. If you want to index new images to search, use:
-    ```sh
-    poetry run python main.py gallery --gallery_path <your_path>
-    ```
 ## Dataset
 The dataset derives from 1k images from the Instagram account [ArtButMakeItSports](https://www.instagram.com/artbutmakeitsports/). Images are downloaded and split into training, validation and test sets. Each image is paired with its corresponding artwork for training purposes, if you want this dataset just ask me stating your usage.
-WikiArt is indexed using the same process, except that there's no expected result. So each artwork is mapped to itself and the embeddings are saved as a numpy file (will be changed to chromadb in the future).
 ## Training
@@ -75,4 +80,22 @@ The training script fine-tunes the ViT model on the prepared dataset. Key steps
 1. Loading the pre-trained "google/vit-base-patch32-224-in21k" weights.
 2. Preparing the dataset and data loaders.
-3. Fine-tuning the model using a custom training loop.

+# Image-to-Art Search
 This project fine-tunes a Vision Transformer (ViT) model, initialized from the pre-trained "google/vit-base-patch32-224-in21k" weights, on the style of [ArtButMakeItSports](https://www.instagram.com/artbutmakeitsports/) to perform image-to-art search across 81k artworks made available by [WikiArt](https://wikiart.org/).
 - [Overview](#overview)
 - [Installation](#installation)
+- [How it works](#how-it-works)
 - [Dataset](#dataset)
 - [Training](#training)
 - [Inference](#inference)
 ### Training
+Fine-tune the ViT model:
+```sh
+make train
+```
 ### Inference via Gradio
+Perform image-to-art search using the fine-tuned model:
+```sh
+make viz
+```
+### Recreate the wikiart gallery
+```sh
+make wikiart
+```
 ### Create new gallery
+If you want to index new images to search, use:
+```sh
+poetry run python main.py gallery --gallery_path <your_path>
+```
 ## Dataset
 The dataset derives from 1k images from the Instagram account [ArtButMakeItSports](https://www.instagram.com/artbutmakeitsports/). Images are downloaded and split into training, validation and test sets. Each image is paired with its corresponding artwork for training purposes. If you want this dataset, just ask me and state your intended usage.
+WikiArt is indexed using the same process, except that there's no expected result: each artwork is mapped to itself, the model is used as a feature extractor, and the gallery embeddings are saved as a NumPy file (this will be changed to chromadb in the future).
 ## Training
 1. Loading the pre-trained "google/vit-base-patch32-224-in21k" weights.
 2. Preparing the dataset and data loaders.
+3. Fine-tuning the model using a custom training loop.
+4. Saving the model to the results folder.
+## Interface
+The recommended way to get results is to use [gradio](https://www.gradio.app/) as an interface by running `make viz`. This starts a server where you can upload an image to search, or use your webcam, and get the top 4 search results.
+### Examples
+## Contributing
+There are three topics I'd appreciate help with:
+1. Growing the gallery by embedding new painting datasets. The current one has 81k artworks and I really want to raise this number to at least 500k;
+2. Porting the encoding and search to a vector db, preferably chromadb;
+3. Opening issues about how this could be improved. I'm not perfect and the code is very spaghetti right now.
+## License
+The source code for the site is licensed under the MIT license, which you can find in the MIT-LICENSE.txt file.
+All graphical assets are licensed under the Creative Commons Attribution 3.0 Unported License.
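
Note on the "Dataset" and "Interface" sections of the updated README: they describe saving gallery embeddings as a NumPy file and returning the top 4 search results. The sketch below is one way this could look. It is not taken from the repository. The function names (`normalize`, `save_gallery`, `search_top_k`), the use of cosine similarity, the 768-dimensional vectors, and the random stand-in embeddings are all assumptions made for illustration. In the real project the vectors would come from the fine-tuned ViT used as a feature extractor.

```python
import os
import tempfile

import numpy as np


def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def save_gallery(embeddings: np.ndarray, path: str) -> None:
    """Persist the (normalized) gallery embeddings as a single .npy file."""
    np.save(path, normalize(embeddings).astype(np.float32))


def search_top_k(query: np.ndarray, gallery_path: str, k: int = 4):
    """Return (indices, scores) of the k gallery rows most similar to the query."""
    gallery = np.load(gallery_path)
    q = normalize(query.reshape(1, -1).astype(np.float32))[0]
    scores = gallery @ q              # cosine similarity against every gallery row
    top = np.argsort(-scores)[:k]     # indices of the k highest scores, best first
    return top, scores[top]


# Stand-in data (assumption): random vectors replace real ViT features.
rng = np.random.default_rng(0)
gallery_embeddings = rng.normal(size=(1000, 768))

gallery_file = os.path.join(tempfile.mkdtemp(), "gallery.npy")
save_gallery(gallery_embeddings, gallery_file)

# A query built to sit close to gallery item 42 (that item plus small noise).
query_embedding = gallery_embeddings[42] + 0.01 * rng.normal(size=768)
indices, scores = search_top_k(query_embedding, gallery_file, k=4)
print(indices, scores)
```

Because the query is a lightly perturbed copy of gallery item 42, that index comes back first. A brute-force matrix product like this is typically workable for a gallery of 81k vectors, and moving to a vector database such as chromadb (as proposed under "Contributing") would replace `np.load` and the dot product with the database's own query call.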