---
license: apache-2.0
datasets:
- TempoFunk/webvid-10M
language:
- en
tags:
- text-to-video
base_model:
- ali-vilab/text-to-video-ms-1.7b
---
# caT text to video
A conditionally augmented text-to-video model. It uses the pre-trained weights of the ModelScope text-to-video model, augmented with temporal conditioning transformers that extend generated clips and create smooth transitions between them.
It also supports prompt interpolation, so the scene can change during clip extensions.
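To illustrate the idea of prompt interpolation, here is a minimal sketch that blends two prompt embeddings linearly across a series of clip extensions. This is an assumption about how such a blend could work, not the repository's actual implementation: the function names `lerp` and `interpolation_schedule` are hypothetical, and the embeddings are plain Python lists standing in for real text-encoder outputs.

```python
# Illustrative sketch only: linear blending is an assumed technique,
# and these helpers are hypothetical, not part of the caT codebase.

def lerp(a, b, t):
    """Linearly interpolate between two equal-length vectors (t in [0, 1])."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolation_schedule(start, end, num_steps):
    """Return one blended embedding per clip extension.

    The first entry equals `start`, the last equals `end`,
    and the entries in between move evenly from one to the other.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    return [lerp(start, end, i / (num_steps - 1)) for i in range(num_steps)]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings" for two prompts.
    bike = [0.0, 2.0]
    motorcycle = [1.0, 0.0]
    for step, emb in enumerate(interpolation_schedule(bike, motorcycle, 3)):
        print(step, emb)
```

With three steps, the middle extension is conditioned on an even mix of both prompts, which is one way a scene could shift gradually rather than abruptly.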
The model was trained on two RTX 6000 Ada GPUs for 5 million steps using the WebVid-10M dataset, with a batch size of 1 and a learning rate of 1e-6 at a resolution of 320x320. It used 8 frames for conditioning and 8 frames for noisy samples, with a stride of 6.
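The frame layout described above can be sketched as follows. This is only one possible reading: it assumes that "stride" means the gap between consecutively sampled source-video frames, and that the 8 conditioning frames come immediately before the 8 noisy frames. The helper names `split_training_window` and `min_video_length` are hypothetical and do not come from the repository.

```python
# Illustrative sketch only: the meaning of "stride" and the ordering of
# conditioning vs. noisy frames are assumptions, and these helpers are
# hypothetical, not part of the caT training code.

def split_training_window(start, num_cond=8, num_noisy=8, stride=6):
    """Pick source-frame indices for one training sample.

    Returns (conditioning_indices, noisy_indices), sampled every
    `stride` frames beginning at `start`.
    """
    total = num_cond + num_noisy
    indices = [start + i * stride for i in range(total)]
    return indices[:num_cond], indices[num_cond:]


def min_video_length(num_cond=8, num_noisy=8, stride=6):
    """Smallest number of source frames that can hold one full window."""
    return (num_cond + num_noisy - 1) * stride + 1


if __name__ == "__main__":
    cond, noisy = split_training_window(start=0)
    print("conditioning:", cond)
    print("noisy:", noisy)
    print("minimum source frames:", min_video_length())
```

Under these assumptions, a single training sample spans 91 source frames, so shorter videos would need a smaller stride or would be skipped.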
## Installation
### Clone the Repository
```bash
git clone https://github.com/motexture/caT-text-to-video-2.3b/
cd caT-text-to-video-2.3b
python3 -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
pip install -r requirements.txt
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
python run.py
```
Visit the provided URL in your browser to interact with the interface and start generating videos.
Examples:
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/OPFi_f4bp2WuCDSYodHJE.mp4"></video>
A guy is riding a bike -> A guy is riding a motorcycle
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/U0Jx7U-Oo4lBgFJoB7E0v.mp4"></video>
Will Smith is eating a hamburger -> Will Smith is eating an ice cream
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/hZprbX6TTpJxWyMDMJIrl.mp4"></video>
A lion is looking around -> A lion is running
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/FrGfwcXRU7FyM9aMAyu3x.mp4"></video>
Darth Vader is surfing on the ocean
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/VoUg8tnsZqnn1QsXz93Xh.mp4"></video>
A beautiful anime girl with pink hair -> Anime girl laughing