Create README.md
README.md
ADDED
@@ -0,0 +1,58 @@
---
datasets:
- Wild-Heart/Disney-VideoGeneration-Dataset
language:
- en
base_model:
- THUDM/CogVideoX-5b
pipeline_tag: text-to-video
library_name: diffusers
tags:
- text-to-video
- diffusers-training
- diffusers
- lora
- cogvideox
- cogvideox-diffusers
---
# CogVideoX LoRA Finetune

<Gallery />

## Model description

This is a LoRA fine-tune of the CogVideoX model `THUDM/CogVideoX-5b`.

The model was trained using [CogVideoX Factory](https://github.com/a-r-r-o-w/cogvideox-factory), a repository containing memory-optimized training scripts for the CogVideoX family of models using [TorchAO](https://github.com/pytorch/ao) and [DeepSpeed](https://github.com/microsoft/DeepSpeed). The scripts were adapted from the [CogVideoX Diffusers trainer](https://github.com/huggingface/diffusers/blob/main/examples/cogvideo/train_cogvideox_lora.py).

## Download model

[Download LoRA](https://huggingface.co/a-r-r-o-w/cogvideox-disney-adamw-4000-0.0003-constant/tree/main) in the Files & Versions tab.

## Usage

Requires the [🧨 Diffusers library](https://github.com/huggingface/diffusers) installed.

```py
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
pipe.load_lora_weights("a-r-r-o-w/cogvideox-disney-adamw-4000-0.0003-constant", weight_name="pytorch_lora_weights.safetensors", adapter_name="cogvideox-lora")

# The LoRA adapter weight is determined by the values used for training.
# In this case, we assume `--lora_alpha` is 32 and `--rank` is 64.
# It can be set lower or higher than the training value to weaken or amplify the effect
# of the LoRA up to a tolerance, beyond which one might notice no effect at all or overflows.
pipe.set_adapters(["cogvideox-lora"], [32 / 64])

video = pipe("BW_STYLE A black and white animated scene unfolds with an anthropomorphic goat surrounded by musical notes and symbols, suggesting a playful environment. Mickey Mouse appears, leaning forward in curiosity as the goat remains still. The goat then engages with Mickey, who bends down to converse or react. The dynamics shift as Mickey grabs the goat, potentially in surprise or playfulness, amidst a minimalistic background. The scene captures the evolving relationship between the two characters in a whimsical, animated setting, emphasizing their interactions and emotions", guidance_scale=6, use_dynamic_cfg=True).frames[0]
export_to_video(video, "output.mp4", fps=8)
```
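
The `32 / 64` value passed to `set_adapters` is the ratio of the assumed `--lora_alpha` to the assumed `--rank`. As a minimal sketch of that arithmetic, the helper below computes the weight and optionally rescales it. The function name `lora_adapter_weight` and its `strength` parameter are hypothetical and are not part of Diffusers; the values 32 and 64 are the same assumptions stated in the comments of the usage snippet.

```python
def lora_adapter_weight(lora_alpha: float, rank: int, strength: float = 1.0) -> float:
    """Return strength * lora_alpha / rank.

    `strength` is a hypothetical knob: 1.0 reproduces the ratio assumed
    for training, values below 1.0 weaken the LoRA effect, and values
    above 1.0 amplify it.
    """
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    return strength * lora_alpha / rank


# Assumed training settings from the usage snippet: lora_alpha=32, rank=64.
weight = lora_adapter_weight(32, 64)
print(weight)  # 0.5
```

The result could then be passed as `pipe.set_adapters(["cogvideox-lora"], [weight])`. Very large `strength` values may run into the overflow issue mentioned in the comments above.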

For more details, including weighting, merging and fusing LoRAs, check the [documentation](https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters) on loading LoRAs in diffusers.

## License

Please adhere to the licensing terms as described [here](https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE) and [here](https://huggingface.co/THUDM/CogVideoX-2b/blob/main/LICENSE).