---
language:
- en
license: creativeml-openrail-m
tags:
- stable-diffusion
- stable-diffusion-diffusers
- diffusers
- text-to-image
- fashion
- ecommerce
inference: false
---
# MUSINSA-IGO (MUSINSA fashion Image Generative Operator)
- - -
MUSINSA-IGO 3.0 is a text-to-image model fine-tuned from [*Stable Diffusion XL 1.0*](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) with LoRA, using street snaps downloaded from the website of [Musinsa](https://www.musinsa.com/app/), a Korean fashion commerce company. It is well suited to generating fashion images.

### Examples
- - -
![assets-01](assets/assets-01.png)
![assets-02](assets/assets-02.png)

### Notes
- - -
* The recommended prompt template is shown below.

**Prompt**: RAW photo, fashion photo of *subject*, (high detailed skin:1.2), 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3

**Negative Prompt**: (deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck

* The source code is available in [this *GitHub* repository](https://github.com/aldente0630/musinsaigo).

* It is recommended to apply a cross-attention scale of 0.5 to 0.75 and to use a refiner.
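As a concrete illustration of the template above, the subject description can simply be substituted into the recommended prompt. This is a minimal sketch: the `PROMPT_TEMPLATE` name and the example subject are our own, not part of the model card.

```python
# Hypothetical helper: fill the recommended prompt template with a subject.
PROMPT_TEMPLATE = (
    "RAW photo, fashion photo of {subject}, (high detailed skin:1.2), "
    "8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3"
)

# Example subject (assumption, for illustration only).
prompt = PROMPT_TEMPLATE.format(subject="a woman in a beige trench coat")
print(prompt)
```

The `make_prompt` helper in the Usage section builds the same shape of string by joining a prefix, your subject, and a quality suffix.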

### Usage
- - -
```python
import torch
from diffusers import DiffusionPipeline


def make_prompt(prompt: str) -> str:
    prompt_prefix = "RAW photo"
    prompt_suffix = (
        "(high detailed skin:1.2), 8k uhd, dslr, soft lighting, "
        "high quality, film grain, Fujifilm XT3"
    )
    return ", ".join([prompt_prefix, prompt, prompt_suffix]).strip()


def make_negative_prompt(negative_prompt: str) -> str:
    negative_prefix = (
        "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, "
        "sketch, cartoon, drawing, anime:1.4), text, close up, cropped, "
        "out of frame, worst quality, low quality, jpeg artifacts, ugly, "
        "duplicate, morbid, mutilated, extra fingers, mutated hands, "
        "poorly drawn hands, poorly drawn face, mutation, deformed, blurry, "
        "dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, "
        "disfigured, gross proportions, malformed limbs, missing arms, "
        "missing legs, extra arms, extra legs, fused fingers, "
        "too many fingers, long neck"
    )
    return (
        ", ".join([negative_prefix, negative_prompt]).strip()
        if len(negative_prompt) > 0
        else negative_prefix
    )


device = "cuda" if torch.cuda.is_available() else "cpu"
model_id = "aldente0630/musinsaigo-3.0"

# Load the SDXL base pipeline and apply the MUSINSA-IGO LoRA weights.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe = pipe.to(device)
pipe.load_lora_weights(model_id)

# Write your prompt here.
PROMPT = "a korean woman wearing a white t-shirt and black pants with a bear on it"
NEGATIVE_PROMPT = ""

# If you're not using a refiner:
image = pipe(
    prompt=make_prompt(PROMPT),
    height=1024,
    width=768,
    num_inference_steps=50,
    guidance_scale=7.5,
    negative_prompt=make_negative_prompt(NEGATIVE_PROMPT),
    cross_attention_kwargs={"scale": 0.75},
).images[0]
image.save("test.png")

# If you're using a refiner: generate latents with the base pipeline,
# then turn them into the final image with the SDXL refiner.
refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=pipe.text_encoder_2,
    vae=pipe.vae,
    torch_dtype=torch.float16,
)
refiner = refiner.to(device)

latents = pipe(
    prompt=make_prompt(PROMPT),
    height=1024,
    width=768,
    num_inference_steps=50,
    guidance_scale=7.5,
    negative_prompt=make_negative_prompt(NEGATIVE_PROMPT),
    output_type="latent",
    cross_attention_kwargs={"scale": 0.75},
).images
image = refiner(
    prompt=make_prompt(PROMPT),
    image=latents,
    num_inference_steps=50,
).images[0]
image.save("test.png")
```
![test](assets/test-01.png)