strollingorange committed 35c406f (verified) · parent: 0a7edc8

Update README.md

# laion-finetuned_room luxury annotater
 
This model is a fine-tuned version of [laion/CLIP-ViT-B-32-laion2B-s34B-b79K](https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K) on a private dataset provided by Wahi Inc. It is designed to classify room images by room type and luxury level.

## Model Description

The model builds on CLIP, fine-tuned specifically for real estate image annotation. It performs zero-shot classification of room images into categories such as standard or contemporary kitchens, bathrooms, and other rooms commonly found in real estate properties. Training uses a multi-stage approach in which diffusion models generate supplementary training data and hierarchical CLIP networks perform the luxury annotation. This fine-tuning process enables high accuracy in distinguishing luxury levels in real estate images.

The model was developed for the paper *"Diffusion-based Data Augmentation and Hierarchical CLIP for Real Estate Image Annotation"*, submitted to the *Pattern Analysis and Applications* Special Issue on Multimedia Sensing and Computing.

![Model Framework](framework.png)
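
As background, CLIP-style zero-shot classification scores an image embedding against one text embedding per candidate label and softmax-normalizes the similarities. The toy sketch below uses made-up two-dimensional embeddings purely for illustration; the real model uses learned high-dimensional embeddings and applies a learned logit scale before the softmax.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def zero_shot_scores(image_emb, label_embs):
    """Softmax over image-text cosine similarities, one per candidate label."""
    sims = [cosine(image_emb, e) for e in label_embs]
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    return [x / total for x in exps]

# Made-up embeddings for illustration only
image_emb = (0.9, 0.2)
label_embs = [(1.0, 0.1),   # stands in for "a photo of standard kitchen"
              (0.1, 1.0)]   # stands in for "a photo of contemporary kitchen"
scores = zero_shot_scores(image_emb, label_embs)  # sums to 1.0
```

The label whose text embedding is most similar to the image embedding receives the highest score, which is how the candidate labels listed later in this card are ranked.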

## Intended Uses & Limitations

This model is intended for:
- Annotating real estate images by classifying room types and luxury levels (e.g., standard or contemporary kitchens, bathrooms, etc.).
- Helping users filter properties on real estate platforms based on the luxury level of rooms.

**Limitations**:
- The model is optimized for real estate images and may not generalize well to other domains.
- Zero-shot classification is limited to the predefined categories and candidate labels used during fine-tuning.
 
## Training and Evaluation Data

The training data was collected and labeled by Wahi Inc. and includes a diverse set of real estate images of kitchens, bathrooms, dining rooms, living rooms, and foyers. Each image was annotated as either standard or contemporary based on the room's aesthetics, design, and quality.

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training:
- **Learning Rate**: 1e-06
- **Train Batch Size**: 384
- **Eval Batch Size**: 24
- **Seed**: 42
- **Optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- **LR Scheduler Type**: linear

### Framework Versions

- **Transformers**: 4.37.2
- **PyTorch**: 2.0.1+cu117
- **Datasets**: 2.14.4
- **Tokenizers**: 0.15.0
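
As a worked illustration of the linear scheduler entry above, the learning rate decays linearly from its initial value toward zero over training. This is a minimal sketch assuming no warmup phase, since no warmup steps are listed in the hyperparameters.

```python
# Sketch of a linear learning-rate schedule: decay from base_lr to zero.
# Assumes no warmup, since no warmup steps are listed above.
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-6) -> float:
    """Learning rate at `step` under linear decay to zero over `total_steps`."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0, 1000))     # 1e-06  (start of training)
print(linear_lr(500, 1000))   # 5e-07  (halfway)
print(linear_lr(1000, 1000))  # 0.0    (end of training)
```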

### Output Example

Below is an example of the model's output: an image of a kitchen classified with its top three predicted labels and confidence scores.

![Model Output Example](example_output.png)

## How to Use the Model

You can use this model for zero-shot image classification with the Hugging Face `pipeline` API. Here is a basic example:

```python
from transformers import pipeline
from PIL import Image

# Initialize the zero-shot image-classification pipeline
classifier = pipeline("zero-shot-image-classification", model="strollingorange/roomLuxuryAnnotater")

# Define the candidate labels
candidate_labels = [
    "a photo of standard bathroom",
    "a photo of contemporary bathroom",
    "a photo of standard kitchen",
    "a photo of contemporary kitchen",
    "a photo of standard foyer",
    "a photo of standard living room",
    "a photo of standard dining room",
    "a photo of contemporary foyer",
    "a photo of contemporary living room",
    "a photo of contemporary dining room",
]

# Load your image (replace 'path_to_your_image.jpg' with your actual image path)
image = Image.open("path_to_your_image.jpg")

# Run zero-shot classification
result = classifier(image, candidate_labels=candidate_labels)

# Output the list of labels and confidence scores
print(result)
```
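
The pipeline returns a list of label/score dictionaries sorted by score. As a small hypothetical post-processing helper (not part of the model), the top label can be split back into a luxury level and a room type, e.g. for the property-filtering use case described above:

```python
# Hypothetical helper: split a candidate label such as
# "a photo of contemporary kitchen" into (luxury_level, room_type).
def parse_label(label: str) -> tuple[str, str]:
    words = label.removeprefix("a photo of ").split()
    return words[0], " ".join(words[1:])

# Illustrative pipeline output (scores are made up, not real model output)
result = [
    {"label": "a photo of contemporary kitchen", "score": 0.81},
    {"label": "a photo of standard kitchen", "score": 0.12},
]
luxury, room = parse_label(result[0]["label"])
print(luxury, room)  # contemporary kitchen
```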

## Acknowledgments

We would like to acknowledge Wahi Inc. for providing the training data and for their continued support in the development of this model. Their collaboration was essential in fine-tuning the model for real estate image annotation.