Linum v2 (2B, text-to-video)
360p or 720p, 2-5 seconds, Apache 2.0
A small text-to-video generation model trained from scratch by Linum AI. This 360p variant has lower VRAM requirements than the 720p variant. Read the launch blog post.
Linum V2 is a 2B-parameter text-to-video model based on a Diffusion Transformer (DiT). It generates 360p (640x360) videos at 24 FPS from text prompts.
| Property | Value |
|---|---|
| Resolution | 640x360 (360p) |
| Frame Rate | 24 FPS |
| Duration | 2-5 seconds |
| Parameters | 2B |
| Architecture | DiT + T5-XXL + WAN 2.1 VAE |
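As a rough illustration of what the table implies, the nominal frame count of a clip is the frame rate multiplied by the duration. The sketch below is plain arithmetic on the values above. `frame_count` and `raw_rgb_bytes` are hypothetical helpers, not part of the Linum codebase, and the number of frames the model actually emits may differ (for example, if the VAE imposes a particular temporal stride).

```python
FPS = 24                  # frame rate from the table
WIDTH, HEIGHT = 640, 360  # 360p resolution from the table


def frame_count(seconds: float, fps: int = FPS) -> int:
    """Nominal number of frames in a clip of the given duration."""
    return round(seconds * fps)


def raw_rgb_bytes(seconds: float) -> int:
    """Size of the uncompressed 8-bit RGB frames, for scale only."""
    return frame_count(seconds) * WIDTH * HEIGHT * 3


if __name__ == "__main__":
    for seconds in (2, 5):
        print(seconds, "s:", frame_count(seconds), "frames,",
              raw_rgb_bytes(seconds) / 1e6, "MB raw RGB")
```

Under these assumptions, the 2-5 second range corresponds to roughly 48-120 frames per clip.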
See the full documentation on GitHub: https://github.com/Linum-AI/linum-v2
First, install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
Then clone and generate your first video:
git clone https://github.com/Linum-AI/linum-v2.git
cd linum-v2
uv sync
uv run python generate_video.py \
--prompt "A cute 3D animated baby goat with shaggy gray fur, a fluffy white chin tuft, and stubby curved horns perches on a round wooden stool. Warm golden studio lights bounce off its glossy cherry-red acoustic guitar as it rhythmically strums with a confident hoof, hind legs dangling. Framed family portraits of other barnyard animals line the cream-colored walls, a leafy potted ficus sits in the back corner, and dust motes drift through the cozy, sun-speckled room." \
--output goat.mp4 \
--seed 16 \
--cfg 10.0 \
--resolution 360p
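To compare several seeds for the same prompt, the command above can be assembled programmatically. The following is a minimal sketch, assuming it is run from inside the `linum-v2` checkout and using only the flags shown above. `build_command`, `generate_variants`, and the output filename pattern are hypothetical names chosen for illustration, not part of the repository.

```python
import subprocess


def build_command(prompt: str, output: str, seed: int,
                  cfg: float = 10.0, resolution: str = "360p") -> list[str]:
    """Assemble the generate_video.py invocation shown above."""
    return [
        "uv", "run", "python", "generate_video.py",
        "--prompt", prompt,
        "--output", output,
        "--seed", str(seed),
        "--cfg", str(cfg),
        "--resolution", resolution,
    ]


def generate_variants(prompt: str, seeds: list[int],
                      dry_run: bool = True) -> list[list[str]]:
    """Build (and optionally run) one command per seed."""
    # The output filename pattern is an arbitrary choice for this sketch.
    commands = [build_command(prompt, f"goat_seed{seed}.mp4", seed)
                for seed in seeds]
    if not dry_run:
        for command in commands:
            # Raises CalledProcessError if generation exits with an error.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    for command in generate_variants("A baby goat strumming a guitar", [16, 17, 18]):
        print(" ".join(command))
```

Passing the arguments as a list avoids shell-quoting problems with long prompts that contain commas or quotes.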
Weights are downloaded automatically on first run (~20GB).
For higher quality, use the 720p model (requires more VRAM).
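Because the first run downloads roughly 20 GB of weights, it can be worth checking free disk space beforehand. This is a small sketch using the standard library. The 2 GB safety margin is an arbitrary assumption, and the check targets the current directory because the source does not state where the weights are cached.

```python
import shutil

WEIGHTS_GB = 20  # approximate download size stated above


def has_room(free_bytes: int, needed_gb: float = WEIGHTS_GB,
             margin_gb: float = 2.0) -> bool:
    """True if free space covers the download plus a safety margin."""
    return free_bytes >= (needed_gb + margin_gb) * 1e9


if __name__ == "__main__":
    # Assumption: the weights land on the same volume as the working directory.
    free = shutil.disk_usage(".").free
    print(f"{free / 1e9:.1f} GB free; enough for weights: {has_room(free)}")
```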
├── dit/
│   └── 360p.safetensors      # DiT model weights
├── vae/
│   └── vae.safetensors       # WAN 2.1 Video VAE
└── t5/
    ├── text_encoder/         # T5-XXL encoder
    └── tokenizer/            # T5 tokenizer
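The layout above can be checked after download with a short script. This is a sketch only: `missing_entries` is a hypothetical helper, and the root directory must be supplied by the user, since the location where the weights are stored is not stated here.

```python
from pathlib import Path

# Relative paths taken from the tree above.
EXPECTED = [
    "dit/360p.safetensors",
    "vae/vae.safetensors",
    "t5/text_encoder",
    "t5/tokenizer",
]


def missing_entries(weights_root: str) -> list[str]:
    """Return the expected entries that are absent under weights_root."""
    root = Path(weights_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    import sys

    # Usage: python check_weights.py /path/to/weights
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_entries(root)
    print("all entries present" if not missing else f"missing: {missing}")
```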
@software{linum_v2_2026,
title = {Linum V2: Text-to-Video Generation},
author = {Linum AI},
year = {2026},
url = {https://github.com/Linum-AI/linum-v2}
}