- Tuned by
- https://huggingface.co/datasets/svjack/video-dataset-Lily-Bikini-organized
- Goal: test whether Mochi-1 can learn a concept (an object or a person) from a tiny dataset (trained at low resolution)
# Installation
```bash
pip install git+https://github.com/huggingface/diffusers.git peft transformers torch sentencepiece opencv-python
```
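After installing, it can help to confirm that every package is importable before downloading the model weights. The snippet below is a minimal sketch using only the standard library; `REQUIRED_MODULES` and `missing_modules` are hypothetical names introduced here for illustration, and the list assumes the usual import names (notably `cv2` for `opencv-python`).

```python
from importlib.util import find_spec

# Import names for the packages installed above (assumed mapping):
# opencv-python is imported as cv2, the others under their package names.
REQUIRED_MODULES = ["diffusers", "peft", "transformers", "torch", "sentencepiece", "cv2"]


def missing_modules(names):
    """Return the entries of `names` that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as `torch`.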
# Example
## Landscape Example
```python
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

# Load the base Mochi-1 model in half precision, then apply the LoRA weights.
pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", torch_dtype=torch.float16)
pipe.load_lora_weights("svjack/mochi_Lily_Bikini_early_lora")

# Reduce GPU memory use. Sequential CPU offload replaces any model-level
# offload hooks, so only this one offload mode is enabled.
pipe.enable_sequential_cpu_offload()
# Decode the VAE output in slices and tiles to lower peak memory further.
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

# Fix the seed so the result is reproducible.
seed = 50
generator = torch.Generator("cpu").manual_seed(seed)

# The prompt starts with the trigger word "Lily:".
prompt = "Lily: The video features a woman with blonde hair wearing a black one-piece swimsuit. She is standing on a sandy beach with the ocean in the background, where waves are visible crashing onto the shore. The sky is clear with a few scattered clouds, suggesting a sunny day. The woman appears to be holding a piece of driftwood or a similar object in her right hand. Her stance and expression suggest she is posing for the camera."

pipeline_args = {
    "prompt": prompt,
    "num_inference_steps": 64,
    "height": 480,
    "width": 848,
    "max_sequence_length": 1024,
    "output_type": "np",
    "num_frames": 19,
    "generator": generator,
}

# Generate the frames and write them to an MP4 file.
video = pipe(**pipeline_args).frames[0]
export_to_video(video, "Lily_Lora.mp4")

# Show the result inline when running in a Jupyter notebook.
from IPython import display
display.clear_output(wait=True)
display.Video("Lily_Lora.mp4")
```
- With LoRA
- With LoRA + upscale
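The `height`, `width`, and `num_frames` values in the example determine how large the latent video is that the model denoises. The sketch below works through that arithmetic. It assumes, which this README does not state, that the Mochi VAE compresses 8x spatially and 6x temporally and that frame counts of the form `6k + 1` are expected; `mochi_latent_shape` is a hypothetical helper, not part of diffusers.

```python
def mochi_latent_shape(num_frames, height, width,
                       temporal_factor=6, spatial_factor=8):
    """Return (latent_frames, latent_height, latent_width) for a request.

    The default factors are assumptions about the Mochi VAE (6x in time,
    8x in space); adjust them if the model you use differs.
    """
    # Conservative check used by this sketch: frame counts of the form 6k + 1.
    if (num_frames - 1) % temporal_factor != 0:
        raise ValueError(
            f"num_frames should be {temporal_factor}*k + 1, got {num_frames}"
        )
    # Height and width must divide evenly by the spatial compression factor.
    if height % spatial_factor != 0 or width % spatial_factor != 0:
        raise ValueError(
            f"height and width should be multiples of {spatial_factor}"
        )
    latent_frames = (num_frames - 1) // temporal_factor + 1
    return latent_frames, height // spatial_factor, width // spatial_factor


# Values taken from the example above.
print(mochi_latent_shape(num_frames=19, height=480, width=848))  # (4, 60, 106)
```

Under these assumptions, the next valid frame counts after 19 are 25, 31, and so on, which is a quick way to pick a longer clip length without guessing.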