# You Only Sample Once (YOSO)

This algorithm was proposed in the paper *You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs*.

This model is fine-tuned from [PixArt-XL-2-512x512](https://huggingface.co/PixArt-alpha/PixArt-XL-2-512x512), enabling one-step text-to-image generation.

Note that YOSO-PixArt was originally trained at 512 resolution. However, we found that a YOSO model that generates 1024-resolution samples can be constructed by merging it with [PixArt-XL-2-1024-MS](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS) (Eq. (15) in the paper), as follows:

![Construction](construction.jpg)

The quality of the resulting samples suggests that YOSO generalizes robustly.

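
The exact form of Eq. (15) is not reproduced in this card. As a rough illustration of what merging weights can look like, the sketch below assumes the merge adds the one-step fine-tuning delta (YOSO 512 weights minus the original 512 weights) to the 1024 weights. That form, the helper name `merge_state_dicts`, and the `scale` parameter are all assumptions made for illustration; please refer to the paper for the actual construction.

```python
def merge_state_dicts(yoso_512, base_512, base_1024, scale=1.0):
    """Hypothetical helper (not from the paper or from diffusers).

    Assumed merge rule: merged = base_1024 + scale * (yoso_512 - base_512).
    Values can be anything that supports + and -, such as floats or tensors.
    """
    merged = {}
    for name, w_1024 in base_1024.items():
        if name in yoso_512 and name in base_512:
            # Transfer the fine-tuning delta onto the 1024 weight.
            merged[name] = w_1024 + scale * (yoso_512[name] - base_512[name])
        else:
            # Parameters that only exist in the 1024 model are kept unchanged.
            merged[name] = w_1024
    return merged


# Toy example with scalar "weights" standing in for real tensors.
base_512 = {"proj.weight": 1.0, "proj.bias": 0.5}
yoso_512 = {"proj.weight": 1.25, "proj.bias": 0.5}
base_1024 = {"proj.weight": 2.0, "proj.bias": 1.0, "pos_embed": 3.0}

merged = merge_state_dicts(yoso_512, base_512, base_1024)
print(merged)
```

With real checkpoints, the same function could in principle be applied to the `state_dict()` of each transformer, since tensors support the same arithmetic.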
## Usage
```python
import torch
from diffusers import PixArtAlphaPipeline, LCMScheduler, Transformer2DModel

# Load the merged YOSO transformer (1024 resolution) in half precision.
transformer = Transformer2DModel.from_pretrained(
    "Yihong666/yoso_pixart1024", torch_dtype=torch.float16).to('cuda')

# Reuse the remaining PixArt components and swap in the YOSO transformer.
pipe = PixArtAlphaPipeline.from_pretrained("PixArt-alpha/PixArt-XL-2-512x512",
                                           transformer=transformer,
                                           torch_dtype=torch.float16, use_safetensors=True)

pipe = pipe.to('cuda')
# Switch to the LCM scheduler and set it to v-prediction.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.scheduler.config.prediction_type = "v_prediction"
# Fix the seed so the sample is reproducible.
generator = torch.manual_seed(318)
imgs = pipe(prompt="Pirate ship trapped in a cosmic maelstrom nebula, rendered in cosmic beach whirlpool engine, volumetric lighting, spectacular, ambient lights, light pollution, cinematic atmosphere, art nouveau style, illustration art artwork by SenseiJaye, intricate detail.",
            num_inference_steps=1,  # one-step inference
            num_images_per_prompt=1,
            generator=generator,
            guidance_scale=1.,
           )[0]
imgs[0]  # the generated image; displayed inline when run in a notebook
```
![Ship](ship_1024.jpg)