cloning multi-view-diffusion repository

Files changed (4)

.gitignore ADDED

+*.pt
+*.yaml
+**/__pycache__
+*.pyc
+venv/

README.md ADDED

+---
+license: openrail
+pipeline_tag: image-to-3d
+---
+This is a copy of [ashawkey/imagedream-ipmv-diffusers](https://huggingface.co/ashawkey/imagedream-ipmv-diffusers).
+It is hosted here for persistence throughout the ML for 3D course.
+# MVDream-diffusers Model Card
+This is a port of https://huggingface.co/Peng-Wang/ImageDream into diffusers.
+For usage, please check: https://github.com/ashawkey/mvdream_diffusers
+## Citation
+```
+@article{wang2023imagedream,
+  title={ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation},
+  author={Wang, Peng and Shi, Yichun},
+  journal={arXiv preprint arXiv:2312.02201},
+  year={2023}
+}
+```
+## Misuse, Malicious Use, and Out-of-Scope Use
+The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.

feature_extractor/preprocessor_config.json ADDED

+{
+  "crop_size": {
+    "height": 224,
+    "width": 224
+  },
+  "do_center_crop": true,
+  "do_convert_rgb": true,
+  "do_normalize": true,
+  "do_rescale": true,
+  "do_resize": true,
+  "feature_extractor_type": "CLIPFeatureExtractor",
+  "image_mean": [
+    0.48145466,
+    0.4578275,
+    0.40821073
+  ],
+  "image_processor_type": "CLIPImageProcessor",
+  "image_std": [
+    0.26862954,
+    0.26130258,
+    0.27577711
+  ],
+  "resample": 3,
+  "rescale_factor": 0.00392156862745098,
+  "size": {
+    "shortest_edge": 224
+  },
+  "use_square_size": false
+}
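
Note on the preprocessing config above: read in order, the settings appear to describe a resize so the shorter side is 224 pixels (`resample: 3` is the Pillow code for bicubic), a centered 224x224 crop, a rescale by 1/255, and a per-channel standardization with the CLIP mean and std. The sketch below reproduces only that arithmetic in plain Python so the numbers can be checked by hand. The function names are hypothetical, and the rounding in `resized_shape` is an assumption that may differ by a pixel from what `CLIPImageProcessor` actually does.

```python
# Values copied from feature_extractor/preprocessor_config.json.
IMAGE_MEAN = (0.48145466, 0.4578275, 0.40821073)
IMAGE_STD = (0.26862954, 0.26130258, 0.27577711)
RESCALE_FACTOR = 0.00392156862745098  # 1 / 255
SHORTEST_EDGE = 224
CROP_SIZE = 224


def resized_shape(height, width, shortest_edge=SHORTEST_EDGE):
    """Scale so the shorter side equals shortest_edge, keeping aspect ratio.

    Assumption: plain round(); the library may truncate instead.
    """
    scale = shortest_edge / min(height, width)
    return round(height * scale), round(width * scale)


def center_crop_box(height, width, crop=CROP_SIZE):
    """Return (top, left, bottom, right) of a centered crop x crop window."""
    top = (height - crop) // 2
    left = (width - crop) // 2
    return top, left, top + crop, left + crop


def normalize_pixel(rgb):
    """Rescale 0..255 values to 0..1, then standardize each channel."""
    return tuple(
        (value * RESCALE_FACTOR - mean) / std
        for value, mean, std in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    )


if __name__ == "__main__":
    h, w = resized_shape(480, 640)
    print("resized:", (h, w))
    print("crop box:", center_crop_box(h, w))
    print("white pixel:", normalize_pixel((255, 255, 255)))
```

In practice the pipeline loads this file through the image processor class named in the config, so a hand-rolled version like this is only useful for sanity-checking inputs.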

image_encoder/config.json ADDED

+{
+  "_name_or_path": "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
+  "architectures": [
+    "CLIPVisionModel"
+  ],
+  "attention_dropout": 0.0,
+  "dropout": 0.0,
+  "hidden_act": "gelu",
+  "hidden_size": 1280,
+  "image_size": 224,
+  "initializer_factor": 1.0,
+  "initializer_range": 0.02,
+  "intermediate_size": 5120,
+  "layer_norm_eps": 1e-05,
+  "model_type": "clip_vision_model",
+  "num_attention_heads": 16,
+  "num_channels": 3,
+  "num_hidden_layers": 32,
+  "patch_size": 14,
+  "projection_dim": 1024,
+  "torch_dtype": "float16",
+  "transformers_version": "4.35.2"
+}
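
Note on the encoder config above: the fields `image_size`, `patch_size`, `hidden_size`, and `num_attention_heads` together fix the shape of the features the vision encoder produces. As a rough check, assuming the standard CLIP vision layout of one class token prepended to the patch tokens, the arithmetic can be sketched as follows. The helper names are hypothetical and nothing here loads the model.

```python
# Values copied from image_encoder/config.json.
IMAGE_SIZE = 224
PATCH_SIZE = 14
HIDDEN_SIZE = 1280
NUM_ATTENTION_HEADS = 16


def vit_sequence_shape(image_size=IMAGE_SIZE, patch_size=PATCH_SIZE,
                       hidden_size=HIDDEN_SIZE):
    """Return (tokens per image, feature width) for a ViT-style encoder.

    Assumption: one class token is prepended to the patch tokens.
    """
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    return num_patches + 1, hidden_size


def head_dim(hidden_size=HIDDEN_SIZE, num_heads=NUM_ATTENTION_HEADS):
    """Width of each attention head; hidden_size must divide evenly."""
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    return hidden_size // num_heads


if __name__ == "__main__":
    print("sequence shape per image:", vit_sequence_shape())
    print("per-head width:", head_dim())
```

With these values each 224x224 image becomes a 16x16 grid of patches, so the last hidden state would carry 257 tokens of width 1280 per image.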