- Tuned on https://huggingface.co/datasets/svjack/video-dataset-Lily-Bikini-organized
- Goal: test whether Mochi-1 can learn a concept (an object or a person) from a tiny dataset (trained at low resolution)

# Installation

```bash
pip install git+https://github.com/huggingface/diffusers.git peft transformers torch sentencepiece opencv-python
```

# Example

## Landscape Example

```python
from diffusers import MochiPipeline
from diffusers.utils import export_to_video
import torch

# Load the base model and apply the LoRA weights
pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", torch_dtype=torch.float16)
pipe.load_lora_weights("svjack/mochi_Lily_Bikini_early_lora")

# Reduce GPU memory use: CPU offloading plus sliced and tiled VAE decoding
pipe.enable_model_cpu_offload()
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

# Fixed seed for reproducible output
seed = 50
generator = torch.Generator("cpu").manual_seed(seed)

prompt = "Lily: The video features a woman with blonde hair wearing a black one-piece swimsuit. She is standing on a sandy beach with the ocean in the background, where waves are visible crashing onto the shore. The sky is clear with a few scattered clouds, suggesting a sunny day. The woman appears to be holding a piece of driftwood or a similar object in her right hand. Her stance and expression suggest she is posing for the camera."

pipeline_args = {
    "prompt": prompt,
    "num_inference_steps": 64,
    "height": 480,
    "width": 848,
    "max_sequence_length": 1024,
    "output_type": "np",
    "num_frames": 19,
    "generator": generator,
}

# Generate the frames and write them to an MP4 file
video = pipe(**pipeline_args).frames[0]
export_to_video(video, "Lily_Lora.mp4")

# Preview the result inline (notebook only)
from IPython import display
display.clear_output(wait=True)
display.Video("Lily_Lora.mp4")
```

- With LoRA
- With LoRA + Upscale