Loading custom pipelines

[[open-in-colab]]

A community pipeline is any [DiffusionPipeline] class whose implementation differs from the original one specified in its paper (for example, [StableDiffusionControlNetPipeline] corresponds to the "Text-to-Image Generation with ControlNet Conditioning" paper). Community pipelines provide additional functionality or extend the original implementation of a pipeline.

There are many great community pipelines, such as Speech to Image and Composable Stable Diffusion, and you can find all of the official community pipelines here.

To load a community pipeline from the Hub, pass two arguments: the repository ID of the community pipeline (custom_pipeline) and the repository ID of the model from which the pipeline weights and components should be loaded. The example below loads a dummy pipeline from hf-internal-testing/diffusers-dummy-pipeline and loads the pipeline weights and components from google/ddpm-cifar10-32.

🔒 Loading a community pipeline from the Hugging Face Hub means trusting that its code is safe. Always inspect the code online before loading and running it automatically!

from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="hf-internal-testing/diffusers-dummy-pipeline"
)
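Because loading a community pipeline means trusting its code, it helps to know where that code can be read first. The sketch below is only an illustration, not part of diffusers: the helper name pipeline_source_url is hypothetical, and it assumes that a Hub-hosted community pipeline keeps its code in a pipeline.py file at the repository root, while an official community pipeline lives under examples/community in the diffusers GitHub repository.

```python
def pipeline_source_url(custom_pipeline: str, revision: str = "main") -> str:
    """Return a browser URL where the pipeline code can be reviewed.

    Hypothetical helper for illustration; the URL layouts are assumptions.
    """
    if "/" in custom_pipeline:
        # "owner/name" is treated as a Hub repository ID whose code is
        # assumed to be stored in pipeline.py at the repository root.
        return f"https://huggingface.co/{custom_pipeline}/blob/{revision}/pipeline.py"
    # A bare name is treated as an official community pipeline, assumed to
    # be a single file in the diffusers GitHub repository.
    return (
        "https://github.com/huggingface/diffusers/blob/"
        f"{revision}/examples/community/{custom_pipeline}.py"
    )


# Print the review URL for each kind of custom_pipeline value used on this page.
for name in ("hf-internal-testing/diffusers-dummy-pipeline", "clip_guided_stable_diffusion"):
    print(pipeline_source_url(name))
```

Opening the printed URL and reading the file before calling from_pretrained is one way to act on the warning above.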

Loading an official community pipeline works similarly, but in addition to loading weights from an official repository ID, you can also pass the pipeline's components directly. The example below loads the community CLIP Guided Stable Diffusion pipeline and directly sets the clip_model and feature_extractor components that the pipeline will use.

from diffusers import DiffusionPipeline
from transformers import CLIPImageProcessor, CLIPModel

clip_model_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"

feature_extractor = CLIPImageProcessor.from_pretrained(clip_model_id)
clip_model = CLIPModel.from_pretrained(clip_model_id)

pipeline = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="clip_guided_stable_diffusion",
    clip_model=clip_model,
    feature_extractor=feature_extractor,
)

For more details about community pipelines, take a look at the community pipelines guide. If you are interested in adding a community pipeline, check out the guide on how to contribute a community pipeline!