strollingorange committed 35c406f (verified) · parent: 0a7edc8

Update README.md

# laion-finetuned_room luxury annotater
 
This model is a fine-tuned version of [laion/CLIP-ViT-B-32-laion2B-s34B-b79K](https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K) on a private dataset provided by Wahi Inc. It is designed to classify room images by room type and luxury level.

## Model Description

The model builds on CLIP, fine-tuned specifically for real estate image annotation. It performs zero-shot classification of room images into categories such as standard or contemporary kitchens, bathrooms, and other rooms commonly found in real estate properties. Training uses a multi-stage approach in which diffusion models generate supplementary training data and hierarchical CLIP networks perform the luxury annotation. This fine-tuning process enables high accuracy in distinguishing luxury levels in real estate images.

The model was developed for the paper *"Diffusion-based Data Augmentation and Hierarchical CLIP for Real Estate Image Annotation"*, submitted to the *Pattern Analysis and Applications* Special Issue on Multimedia Sensing and Computing.

![Model Framework](framework.png)
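
As background, CLIP-style zero-shot classification scores an image embedding against one text embedding per candidate label and softmax-normalizes the similarities. The toy sketch below uses made-up two-dimensional embeddings purely for illustration; the real model uses learned high-dimensional embeddings and applies a learned logit scale before the softmax.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def zero_shot_scores(image_emb, label_embs):
    """Softmax over image-text cosine similarities, one per candidate label."""
    sims = [cosine(image_emb, e) for e in label_embs]
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    return [x / total for x in exps]

# Made-up embeddings for illustration only
image_emb = (0.9, 0.2)
label_embs = [(1.0, 0.1),   # stands in for "a photo of standard kitchen"
              (0.1, 1.0)]   # stands in for "a photo of contemporary kitchen"
scores = zero_shot_scores(image_emb, label_embs)  # sums to 1.0
```

The label whose text embedding is most similar to the image embedding receives the highest score, which is how the candidate labels listed later in this card are ranked.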

## Intended Uses & Limitations

This model is intended for:
- Annotating real estate images by classifying room types and luxury levels (e.g., standard or contemporary kitchens, bathrooms, etc.).
- Helping users filter properties on real estate platforms based on the luxury level of rooms.

**Limitations**:
- The model is optimized for real estate images and may not generalize well to other domains.
- Zero-shot classification is limited to the predefined categories and candidate labels used during fine-tuning.
 
## Training and Evaluation Data

The training data was collected and labeled by Wahi Inc. and includes a diverse set of real estate images of kitchens, bathrooms, dining rooms, living rooms, and foyers. Each image was annotated as either standard or contemporary based on the room's aesthetics, design, and quality.

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training:
- **Learning Rate**: 1e-06
- **Train Batch Size**: 384
- **Eval Batch Size**: 24
- **Seed**: 42
- **Optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- **LR Scheduler Type**: linear

### Framework Versions

- **Transformers**: 4.37.2
- **PyTorch**: 2.0.1+cu117
- **Datasets**: 2.14.4
- **Tokenizers**: 0.15.0
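
As a worked illustration of the linear scheduler entry above, the learning rate decays linearly from its initial value toward zero over training. This is a minimal sketch assuming no warmup phase, since no warmup steps are listed in the hyperparameters.

```python
# Sketch of a linear learning-rate schedule: decay from base_lr to zero.
# Assumes no warmup, since no warmup steps are listed above.
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-6) -> float:
    """Learning rate at `step` under linear decay to zero over `total_steps`."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0, 1000))     # 1e-06  (start of training)
print(linear_lr(500, 1000))   # 5e-07  (halfway)
print(linear_lr(1000, 1000))  # 0.0    (end of training)
```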

### Output Example

Below is an example of the model's output: an image of a kitchen classified with its top three predicted labels and confidence scores.

![Model Output Example](example_output.png)

## How to Use the Model

You can use this model for zero-shot image classification with the Hugging Face `pipeline` API. Here is a basic example:

```python
from transformers import pipeline
from PIL import Image

# Initialize the zero-shot image-classification pipeline
classifier = pipeline("zero-shot-image-classification", model="strollingorange/roomLuxuryAnnotater")

# Define the candidate labels
candidate_labels = [
    "a photo of standard bathroom",
    "a photo of contemporary bathroom",
    "a photo of standard kitchen",
    "a photo of contemporary kitchen",
    "a photo of standard foyer",
    "a photo of standard living room",
    "a photo of standard dining room",
    "a photo of contemporary foyer",
    "a photo of contemporary living room",
    "a photo of contemporary dining room",
]

# Load your image (replace 'path_to_your_image.jpg' with your actual image path)
image = Image.open("path_to_your_image.jpg")

# Run zero-shot classification
result = classifier(image, candidate_labels=candidate_labels)

# Output the list of labels and confidence scores
print(result)
```
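
The pipeline returns a list of label/score dictionaries sorted by score. As a small hypothetical post-processing helper (not part of the model), the top label can be split back into a luxury level and a room type, e.g. for the property-filtering use case described above:

```python
# Hypothetical helper: split a candidate label such as
# "a photo of contemporary kitchen" into (luxury_level, room_type).
def parse_label(label: str) -> tuple[str, str]:
    words = label.removeprefix("a photo of ").split()
    return words[0], " ".join(words[1:])

# Illustrative pipeline output (scores are made up, not real model output)
result = [
    {"label": "a photo of contemporary kitchen", "score": 0.81},
    {"label": "a photo of standard kitchen", "score": 0.12},
]
luxury, room = parse_label(result[0]["label"])
print(luxury, room)  # contemporary kitchen
```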

## Acknowledgments

We would like to acknowledge Wahi Inc. for providing the training data and for their continued support in the development of this model. Their collaboration was essential in fine-tuning the model for real estate image annotation.