---
license: creativeml-openrail-m
base_model: nitrosocke/mo-di-diffusion
training_prompt: a bear is playing guitar
tags:
- tune-a-video
- text-to-video
- diffusers
inference: false
---

# Tune-A-Video - Modern Disney

## Model Description
- Base model: [nitrosocke/mo-di-diffusion](https://huggingface.co/nitrosocke/mo-di-diffusion)
- Training prompt: a bear is playing guitar.

![sample-train](samples/train.gif)

## Samples

![sample-500](samples/sample-500.gif)

Test prompt: a [handsome prince/magical princess/rabbit/baby] is playing guitar, modern disney style.
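
The bracketed test prompt above is shorthand for four separate prompts, one per subject. A minimal sketch of expanding that shorthand (`expand_prompt` is a hypothetical helper for this card, not part of the Tune-A-Video codebase):

```python
import re

def expand_prompt(pattern: str) -> list:
    """Expand one '[a/b/c]' slot in a prompt into one prompt per option.

    Hypothetical helper; the bracket syntax is just this card's shorthand.
    """
    match = re.search(r"\[([^\]]+)\]", pattern)
    if match is None:
        return [pattern]
    options = match.group(1).split("/")
    return [pattern[:match.start()] + opt + pattern[match.end():] for opt in options]

prompts = expand_prompt(
    "a [handsome prince/magical princess/rabbit/baby] is playing guitar, modern disney style"
)
# One concrete prompt per subject, ready to feed to the pipeline below.
```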

## Usage

Clone the GitHub repository:
```bash
git clone https://github.com/showlab/Tune-A-Video.git
```

Then run the inference code:

```python
import torch

from tuneavideo.pipelines.pipeline_tuneavideo import TuneAVideoPipeline
from tuneavideo.models.unet import UNet3DConditionModel
from tuneavideo.util import save_videos_grid

# Load the fine-tuned 3D UNet on top of the base text-to-image weights
pretrained_model_path = "nitrosocke/mo-di-diffusion"
unet_model_path = "Tune-A-Video-library/mo-di-bear-guitar"
unet = UNet3DConditionModel.from_pretrained(unet_model_path, subfolder="unet", torch_dtype=torch.float16).to("cuda")
pipe = TuneAVideoPipeline.from_pretrained(pretrained_model_path, unet=unet, torch_dtype=torch.float16).to("cuda")
pipe.enable_xformers_memory_efficient_attention()

# Generate an 8-frame, 512x512 video and save it as a GIF
prompt = "a magical princess is playing guitar, modern disney style"
video = pipe(prompt, video_length=8, height=512, width=512, num_inference_steps=50, guidance_scale=7.5).videos

save_videos_grid(video, f"./{prompt}.gif")
```
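
`save_videos_grid` writes the whole batch as a single GIF. If you want individual frames instead, a minimal sketch of splitting the output — using a NumPy array as a stand-in, and assuming the pipeline returns a `(batch, channels, frames, height, width)` float tensor with values in `[0, 1]` (an assumption about the layout, not documented API):

```python
import numpy as np

def video_to_frames(video: np.ndarray) -> list:
    """Split a (batch, channels, frames, H, W) float video in [0, 1]
    into per-frame (H, W, channels) uint8 images for the first batch item.

    Hypothetical helper; the layout mirrors what the pipeline's
    `.videos` output is assumed to be.
    """
    b, c, f, h, w = video.shape
    frames = []
    for t in range(f):
        frame = video[0, :, t]                                  # (C, H, W)
        frame = (frame.transpose(1, 2, 0) * 255).round().astype(np.uint8)
        frames.append(frame)
    return frames

# Stand-in for a generated 8-frame video
video = np.random.rand(1, 3, 8, 64, 64).astype(np.float32)
frames = video_to_frames(video)
```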

## Related Papers
- [Tune-A-Video](https://arxiv.org/abs/2212.11565): One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
- [Stable Diffusion](https://arxiv.org/abs/2112.10752): High-Resolution Image Synthesis with Latent Diffusion Models