<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Philosophy

🧨 Diffusers provides **state-of-the-art** pretrained diffusion models across multiple modalities.
Its purpose is to serve as a **modular toolbox** for both inference and training.

We aim to build a library that stands the test of time, and therefore take API design very seriously.

In a nutshell, Diffusers is built to be a natural extension of PyTorch. Most of our design choices are therefore based on [PyTorch's Design Principles](https://pytorch.org/docs/stable/community/design.html#pytorch-design-philosophy). Let's go over the most important ones:

## Usability over Performance

- While Diffusers has many built-in performance-enhancing features (see [Memory and Speed](https://huggingface.co/docs/diffusers/optimization/fp16)), models are always loaded with the highest precision and the lowest level of optimization. By default, diffusion pipelines are therefore instantiated on CPU with float32 precision unless the user specifies otherwise. This ensures usability across different platforms and accelerators and means that no complex installation is required to run the library.
- Diffusers aims to be a **lightweight** package and therefore has very few required dependencies, but many optional dependencies that can improve performance (such as `accelerate`, `safetensors`, `onnx`, etc.). We strive to keep the library as lightweight as possible so that it can be added as a dependency of other packages without much concern.
- Diffusers prefers simple, self-explanatory code over condensed, magic code. This means that shorthand syntax such as lambda functions and advanced PyTorch operators are often not desired.

## Simple over easy

As PyTorch states, **explicit is better than implicit** and **simple is better than complex**. This design philosophy is reflected in multiple parts of the library:
- We follow PyTorch's API with methods like [`DiffusionPipeline.to`](https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.to) to let the user handle device management.
- Raising concise error messages is preferred to silently correcting erroneous input. Diffusers aims at teaching the user rather than making the library as easy to use as possible.
- Complex model and scheduler logic is exposed instead of being handled magically inside. Schedulers/samplers are separated from diffusion models with minimal dependencies on each other. This forces the user to write the unrolled denoising loop. However, the separation makes debugging easier and gives the user more control over adapting the denoising process or switching out diffusion models or schedulers.
- Separately trained components of the diffusion pipeline, such as the text encoder, the UNet, and the variational autoencoder, each have their own model class. This forces the user to handle the interaction between the different model components, and the serialization format separates the model components into different files. However, this makes debugging and customization easier. DreamBooth or Textual Inversion training is very simple thanks to Diffusers' ability to separate single components of the diffusion pipeline.
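
The preference for raising errors over silently correcting input can be illustrated with a small sketch. The helper names `check_image_size` and `silently_fixed_size` and the divisor of 8 are assumptions chosen for illustration; this is not the library's own validation code.

```python
from typing import Tuple


def check_image_size(height: int, width: int, divisor: int = 8) -> None:
    """Raise a concise error for unsupported sizes instead of rounding them.

    Hypothetical helper: the name and the default divisor are illustrative.
    """
    if height % divisor != 0 or width % divisor != 0:
        raise ValueError(
            f"`height` and `width` have to be divisible by {divisor} "
            f"but are {height} and {width}."
        )


def silently_fixed_size(height: int, width: int, divisor: int = 8) -> Tuple[int, int]:
    """The 'easy' alternative this design avoids: round down without telling the user."""
    return height - height % divisor, width - width % divisor


check_image_size(512, 512)  # valid input passes without any output

try:
    check_image_size(500, 512)
except ValueError as error:
    print(error)  # the user learns what the constraint is
```

With the explicit check the caller finds out about the constraint immediately, whereas the silent variant would return a differently sized result without any hint as to why.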

## Tweakable, contributor-friendly over abstraction

For large parts of the library, Diffusers adopts an important design principle of the [Transformers library](https://github.com/huggingface/transformers): prefer copy-pasted code over hasty abstractions. This design principle is very opinionated and stands in stark contrast to popular design principles such as [Don't repeat yourself (DRY)](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself).
In short, just as Transformers does for modeling files, Diffusers prefers to keep an extremely low level of abstraction and very self-contained code. Functions, long code blocks, and even classes can be copied across multiple files, which at first can look like a bad, sloppy design choice that makes the library unmaintainable. **However**, this design has proven to be very successful and makes a lot of sense for community-driven, open-source machine learning libraries because:
- Machine learning is an extremely fast-moving field in which paradigms, model architectures, and algorithms change rapidly, which makes it very difficult to define long-lasting code abstractions.
- Machine learning practitioners need to be able to quickly tweak existing code for ideation and research, and therefore prefer self-contained code over code that contains many abstractions.
- Open-source libraries rely on community contributions and therefore must be easy to contribute to. The more abstract the code, the more dependencies it has, the harder it is to read, and the harder it is to contribute to. Contributors simply stop contributing to very abstract libraries out of fear of breaking vital functionality. If contributing to a library cannot break other fundamental code, the library is not only more inviting for potential new contributors, but it is also easier to review and to contribute to multiple parts in parallel.

At Hugging Face, we call this design the **single-file policy**, which means that almost all of the code of a certain class should be written in a single, self-contained file. To read more about the philosophy, you can have a look at [this blog post](https://huggingface.co/blog/transformers-design-philosophy).
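
Copy-pasted code only stays maintainable if duplicates do not drift apart. The following is a minimal sketch of one way to check this, assuming each copy starts with a `# Copied from <name>` marker comment. The `stale_copies` helper and the dictionary layout are hypothetical and do not reproduce the repository's own tooling.

```python
import re

# Marker comment that names the function a copy was taken from.
COPIED_FROM = re.compile(r"#\s*Copied from\s+(\S+)")


def stale_copies(sources):
    """Return the names of copies whose body no longer matches the original.

    `sources` maps a function name to its body as a string (hypothetical layout).
    """
    stale = []
    for name, body in sources.items():
        first_line, _, rest = body.partition("\n")
        marker = COPIED_FROM.match(first_line.strip())
        if marker is None:
            continue  # an original, not a copy
        original = sources[marker.group(1)]
        if rest.strip() != original.strip():
            stale.append(name)
    return stale


sources = {
    "pipeline_a.scale": "x = x * 2\nreturn x",
    "pipeline_b.scale": "# Copied from pipeline_a.scale\nx = x * 2\nreturn x",
    "pipeline_c.scale": "# Copied from pipeline_a.scale\nx = x * 3\nreturn x",
}
print(stale_copies(sources))  # ['pipeline_c.scale']
```

Each file remains readable on its own, while a check of this kind flags copies that have diverged from their source.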

In Diffusers, we follow this philosophy for both pipelines and schedulers, but only partly for diffusion models. The reason we do not follow it fully for diffusion models is that almost all diffusion pipelines, such as [DDPM](https://huggingface.co/docs/diffusers/api/pipelines/ddpm), [Stable Diffusion](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/overview#stable-diffusion-pipelines), [unCLIP (DALL·E 2)](https://huggingface.co/docs/diffusers/api/pipelines/unclip) and [Imagen](https://imagen.research.google/), rely on the same diffusion model, the [UNet](https://huggingface.co/docs/diffusers/api/models/unet2d-cond).

Great, now you should have a general understanding of why 🧨 Diffusers is designed the way it is 🤗.
We try to apply these design principles consistently across the library. Nevertheless, there are some minor exceptions to the philosophy and some unlucky design choices. If you have feedback regarding the design, we would love to hear it [directly on GitHub](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feedback.md&title=).

## Design Philosophy in Details

Now, let's look at the details of the design philosophy. Diffusers essentially consists of three major classes: [pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines), [models](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models), and [schedulers](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers). Let's walk through the more detailed design decisions for each class.

### Pipelines

Pipelines are designed to be easy to use (and therefore do not follow [*Simple over easy*](#simple-over-easy) 100%), are not feature-complete, and can loosely be seen as examples of how to use [models](#models) and [schedulers](#schedulers) for inference.

The following design principles are followed:
- Pipelines follow the single-file policy. All pipelines can be found in individual directories under `src/diffusers/pipelines`. One pipeline folder corresponds to one diffusion paper/project/release. Multiple pipeline files can be gathered in one pipeline folder, as is done for [`src/diffusers/pipelines/stable_diffusion`](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/stable_diffusion). If pipelines share similar functionality, one can make use of the [#Copied from mechanism](https://github.com/huggingface/diffusers/blob/125d783076e5bd9785beb05367a2d2566843a271/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py#L251).
- Pipelines all inherit from [`DiffusionPipeline`].
- Every pipeline consists of different model and scheduler components, which are documented in the [`model_index.json` file](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json), are accessible under the same name as attributes of the pipeline, and can be shared between pipelines with the [`DiffusionPipeline.components`](https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.components) function.
- Every pipeline should be loadable via the [`DiffusionPipeline.from_pretrained`](https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained) function.
- Pipelines should be used **only** for inference.
- Pipelines should be very readable, self-explanatory, and easy to tweak.
- Pipelines should be designed to build on top of each other and be easy to integrate into higher-level APIs.
- Pipelines are **not** intended to be feature-complete user interfaces. For feature-complete user interfaces, one should rather have a look at [InvokeAI](https://github.com/invoke-ai/InvokeAI), [Diffuzers](https://github.com/abhishekkrthakur/diffuzers), and [lama-cleaner](https://github.com/Sanster/lama-cleaner).
- Every pipeline should have one and only one way to run it: the `__call__` method. The naming of the `__call__` arguments should be shared across all pipelines.
- Pipelines should be named after the task they are intended to solve.
- In almost all cases, novel diffusion pipelines shall be implemented in a new pipeline folder/file.
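
Several of these principles (components exposed as attributes, sharing them through `components`, a single `__call__` entry point with shared argument names) can be sketched in plain Python. All class and function names below are hypothetical toy stand-ins that operate on floats, and the "scheduler" is reduced to a plain function for brevity; none of this is the library's implementation.

```python
class ToyBasePipeline:
    """Hypothetical stand-in for the shared base class; not the library's implementation."""

    def __init__(self, model, scheduler):
        # Components are stored as attributes under the names used to pass them in.
        self.model = model
        self.scheduler = scheduler

    @property
    def components(self):
        """Return the components as a dict so another pipeline can reuse them."""
        return {"model": self.model, "scheduler": self.scheduler}


class ToyDenoisePipeline(ToyBasePipeline):
    def __call__(self, sample, num_inference_steps=4):
        """Single public entry point: run the full denoising loop."""
        for _ in range(num_inference_steps):
            sample = self.scheduler(sample, self.model(sample))
        return sample


class ToyPartialDenoisePipeline(ToyBasePipeline):
    def __call__(self, sample, num_inference_steps=4, strength=0.5):
        """Same argument names, plus `strength`; the loop is written out again on purpose."""
        steps = int(num_inference_steps * strength)
        for _ in range(steps):
            sample = self.scheduler(sample, self.model(sample))
        return sample


def toy_model(sample):
    """Pretend noise prediction: half of the current value."""
    return 0.5 * sample


def toy_scheduler(sample, model_output):
    """Pretend update rule: subtract the predicted noise."""
    return sample - model_output


pipe = ToyDenoisePipeline(model=toy_model, scheduler=toy_scheduler)
# Build a second pipeline from the first one's components without reloading anything.
partial_pipe = ToyPartialDenoisePipeline(**pipe.components)

print(pipe(16.0, num_inference_steps=4))          # 1.0
print(partial_pipe(16.0, num_inference_steps=4))  # 4.0
```

The two toy pipelines duplicate their loop instead of sharing it through a common helper, which mirrors the preference for self-contained pipeline files described above.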

### Models

Models are designed as configurable toolboxes that are natural extensions of [PyTorch's Module class](https://pytorch.org/docs/stable/generated/torch.nn.Module.html). They only partly follow the **single-file policy**.

The following design principles are followed:
- Models correspond to **a type of model architecture**. For example, the [`UNet2DConditionModel`] class is used for all UNet variations that expect 2D image inputs and are conditioned on some context.
- All models can be found in [`src/diffusers/models`](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models), and every model architecture shall be defined in its own file, e.g. [`unet_2d_condition.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/unet_2d_condition.py), [`transformer_2d.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformer_2d.py), etc.
- Models **do not** follow the single-file policy and should make use of smaller model building blocks, such as [`attention.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention.py), [`resnet.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/resnet.py), [`embeddings.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/embeddings.py), etc. **Note**: This is in stark contrast to Transformers' modeling files and shows that models do not really follow the single-file policy.
- Models intend to expose complexity, just like PyTorch's `Module` class, and give clear error messages.
- Models all inherit from `ModelMixin` and `ConfigMixin`.
- Models can be optimized for performance when this does not demand major code changes, keeps backward compatibility, and gives a significant memory or compute gain.
- Models should by default have the highest precision and the lowest performance setting.
- To integrate new model checkpoints whose general architecture can be classified as an architecture that already exists in Diffusers, the existing model architecture shall be adapted to make it work with the new checkpoint. One should only create a new file if the model architecture is fundamentally different.
- Models should be designed to be easily extendable to future changes. This can be achieved by limiting public function arguments and configuration arguments, and by "foreseeing" future changes. For example, it is usually better to add string `..._type` arguments that can easily be extended to new future types than boolean `is_..._type` arguments. Only the minimum amount of changes shall be made to existing architectures to make a new model checkpoint work.
- The model design is a difficult trade-off between keeping code readable and concise and supporting many model checkpoints. For most parts of the modeling code, classes shall be adapted for new model checkpoints, while there are some exceptions where it is preferred to add new classes to make sure the code stays concise and readable in the long term, such as [UNet blocks](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/unet_2d_blocks.py) and [Attention processors](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
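
The advice to prefer string `..._type` arguments over boolean `is_..._type` arguments can be made concrete with a short sketch. The class `ToyBlockConfig`, the argument `norm_type`, and its option names are assumptions made up for this illustration and do not correspond to an actual model configuration.

```python
class ToyBlockConfig:
    """Hypothetical model config; the argument and option names are illustrative."""

    SUPPORTED_NORM_TYPES = ("layer_norm", "group_norm")

    def __init__(self, norm_type="layer_norm"):
        # A boolean such as `is_group_norm` could only ever express two variants.
        # A string argument can grow: supporting a third variant only adds an
        # entry to the tuple above, and existing configs keep loading unchanged.
        if norm_type not in self.SUPPORTED_NORM_TYPES:
            raise ValueError(
                f"`norm_type` has to be one of {self.SUPPORTED_NORM_TYPES} "
                f"but is {norm_type!r}."
            )
        self.norm_type = norm_type


default_config = ToyBlockConfig()
group_norm_config = ToyBlockConfig(norm_type="group_norm")
```

Rejecting unknown values with a clear message also keeps this sketch in line with the principle that models should give clear error messages.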

### Schedulers

Schedulers are responsible for guiding the denoising process for inference as well as for defining a noise schedule for training. They are designed as individual classes with loadable configuration files and strictly follow the **single-file policy**.

The following design principles are followed:
- All schedulers can be found in [`src/diffusers/schedulers`](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers).
- Schedulers are **not** allowed to import from large utility files and shall be kept very self-contained.
- One scheduler Python file corresponds to one scheduler algorithm (as might be defined in a paper).
- If schedulers share similar functionality, we can make use of the `#Copied from` mechanism.
- Schedulers all inherit from `SchedulerMixin` and `ConfigMixin`.
- Schedulers can be easily swapped out with the [`ConfigMixin.from_config`](https://huggingface.co/docs/diffusers/main/en/api/configuration#diffusers.ConfigMixin.from_config) method, as explained in detail [here](../using-diffusers/schedulers.md).
- Every scheduler has to have a `set_num_inference_steps` and a `step` function. `set_num_inference_steps(...)` has to be called before every denoising process, i.e. before `step(...)` is called.
- Every scheduler exposes the timesteps to be looped over via a `timesteps` attribute, which is an array of the timesteps the model will be called upon.
- The `step(...)` function takes a predicted model output and the "current" sample (x_t) and returns the "previous", slightly more denoised sample (x_t-1).
- Given the complexity of diffusion schedulers, the `step` function does not expose all the complexity and can be a bit of a "black box".
- In almost all cases, novel schedulers shall be implemented in a new scheduling file.
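
The scheduler contract listed above, together with the unrolled denoising loop it implies, can be sketched with toy classes. Everything here is a hypothetical illustration on plain floats: the class names, the update rules, the dictionary-based `config`, and the extra `timestep` argument to `step` are assumptions, not the library's implementation.

```python
class ToyScheduler:
    """Hypothetical scheduler following the contract listed above; the math is a toy."""

    def __init__(self, num_train_timesteps=1000):
        self.config = {"num_train_timesteps": num_train_timesteps}
        self.timesteps = []

    @classmethod
    def from_config(cls, config):
        """Build a scheduler from another scheduler's config, which makes swapping easy."""
        return cls(**config)

    def set_num_inference_steps(self, num_inference_steps):
        """Choose evenly spaced timesteps, ordered from most to least noisy."""
        stride = self.config["num_train_timesteps"] // num_inference_steps
        self.timesteps = [stride * i for i in reversed(range(num_inference_steps))]

    def step(self, model_output, timestep, sample):
        """Take the model output and the current sample x_t, return x_t-1."""
        return sample - model_output


class ToyDampedScheduler(ToyScheduler):
    """A second toy algorithm; subclassing is used only to keep the sketch short."""

    def step(self, model_output, timestep, sample):
        return sample - 0.5 * model_output


def toy_model(sample, timestep):
    """Pretend noise prediction: half of the current value."""
    return 0.5 * sample


def denoise(scheduler, sample, num_inference_steps):
    """The unrolled denoising loop that the user writes explicitly."""
    scheduler.set_num_inference_steps(num_inference_steps)
    for timestep in scheduler.timesteps:
        model_output = toy_model(sample, timestep)
        sample = scheduler.step(model_output, timestep, sample)
    return sample


scheduler = ToyScheduler()
# Swap the algorithm while keeping the configuration.
swapped = ToyDampedScheduler.from_config(scheduler.config)

print(denoise(scheduler, 16.0, num_inference_steps=4))  # 1.0
print(denoise(swapped, 16.0, num_inference_steps=2))    # 9.0
```

Because the loop lives in user code, changing the scheduler or inspecting intermediate samples requires no change to the model or to the scheduler classes themselves.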