wanghaofan committed on
Commit
8eea91a
1 Parent(s): f2b7487

Update README.md

Files changed (1)
  1. README.md +90 -6
README.md CHANGED
@@ -1,6 +1,90 @@
- ---
- license: other
- license_name: stabilityai-ai-community
- license_link: >-
-   https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md
- ---
+ ---
+ license: other
+ license_name: stabilityai-ai-community
+ license_link: >-
+   https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md
+ language:
+ - en
+ library_name: diffusers
+ pipeline_tag: text-to-image
+ tags:
+ - Text-to-Image
+ - IP-Adapter
+ - StableDiffusion3Pipeline
+ - image-generation
+ - Stable Diffusion
+ base_model:
+ - stabilityai/stable-diffusion-3.5-large
+ ---
+
20
+ # SD3.5-Large-IP-Adapter
21
+
22
+ This repository contains a IP-Adapter for SD3.5-Large model released by researchers from [InstantX Team](https://huggingface.co/InstantX), where image work just like text, so it may not be responsive or interfere with other text, but we do hope you enjoy this model, have fun and share your creative works with us [on Twitter](https://x.com/instantx_ai).
23
+
24
+ # Model Card
25
+ This is a regular IP-Adapter, where the new layers are added into all 38 blocks. We use [google/siglip-so400m-patch14-384](https://huggingface.co/google/siglip-so400m-patch14-384) to encode image for its superior performance, and adopt a TimeResampler to project. The image token number is set to 64.
26
+
27
+ # Showcases
28
+
29
+ <div class="container">
30
+ <img src="./teasers/0.png" width="1024"/>
31
+ <img src="./teasers/1.png" width="1024"/>
32
+ </div>
33
+
34
+ # Inference
35
+ The code has not been integrated into diffusers yet, please use our local files at this moment.
+ ```python
+ import torch
+ from PIL import Image
+
+ from models.transformer_sd3 import SD3Transformer2DModel
+ from pipeline_stable_diffusion_3_ipa import StableDiffusion3Pipeline
+
+ model_path = 'stabilityai/stable-diffusion-3.5-large'
+ ip_adapter_path = './ip-adapter.bin'
+ image_encoder_path = "google/siglip-so400m-patch14-384"
+
+ transformer = SD3Transformer2DModel.from_pretrained(
+     model_path, subfolder="transformer", torch_dtype=torch.bfloat16
+ )
+
+ pipe = StableDiffusion3Pipeline.from_pretrained(
+     model_path, transformer=transformer, torch_dtype=torch.bfloat16
+ ).to("cuda")
+
+ pipe.init_ipadapter(
+     ip_adapter_path=ip_adapter_path,
+     image_encoder_path=image_encoder_path,
+     nb_token=64,
+ )
+
+ ref_img = Image.open('./assets/1.jpg').convert('RGB')
+ image = pipe(
+     width=1024,
+     height=1024,
+     prompt='a cat',
+     negative_prompt="lowres, low quality, worst quality",
+     num_inference_steps=24,
+     guidance_scale=5.0,
+     generator=torch.Generator("cuda").manual_seed(42),
+     clip_image=ref_img,
+     ipadapter_scale=0.5,
+ ).images[0]
+ image.save('./result.jpg')
+ ```
+
+ # License
+ The model is released under the [stabilityai-ai-community](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md) license. All rights reserved.
+
+ # Acknowledgements
+ This project is sponsored by [HuggingFace](https://huggingface.co/) and [fal.ai](https://fal.ai/).
+
+ # Citation
+ If you find this project useful in your research, please cite us via:
+ ```
+ @misc{sd35-large-ipa,
+     author = {InstantX Team},
+     title = {InstantX SD3.5-Large IP-Adapter Page},
+     year = {2024},
+ }
+ ```
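
Note on the Model Card section of the README added in this commit: it says the reference image is encoded with `google/siglip-so400m-patch14-384` and then projected to 64 image tokens (`nb_token=64` in the inference snippet). The sketch below is not part of the commit. It only works out the token counts implied by those names, assuming the encoder splits a 384x384 input into non-overlapping 14x14 patches with no extra class token, and that the resampler maps those patch tokens to the fixed 64-token set. The helper `patch_token_count` is a hypothetical name used for illustration.

```python
# Token-count arithmetic for the image branch described in the Model Card.
# Assumptions (not stated in the README): non-overlapping 14x14 patches on a
# 384x384 input, no class token, and a resampler with a fixed output length.

IMAGE_SIZE = 384        # read off the encoder name "patch14-384"
PATCH_SIZE = 14         # read off the encoder name "patch14-384"
NUM_IMAGE_TOKENS = 64   # README: "The image token number is set to 64"


def patch_token_count(image_size: int, patch_size: int) -> int:
    """Patch tokens for a square image split into non-overlapping patches."""
    per_side = image_size // patch_size  # a partial patch at the border is dropped
    return per_side * per_side


encoder_tokens = patch_token_count(IMAGE_SIZE, PATCH_SIZE)
compression = encoder_tokens / NUM_IMAGE_TOKENS

print(f"encoder patch tokens:   {encoder_tokens}")
print(f"resampled image tokens: {NUM_IMAGE_TOKENS}")
print(f"compression ratio:      {compression:.1f}x")
```

Under these assumptions, the adapter conditions each block on far fewer image tokens than the encoder emits, which keeps the added attention cost per block small.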