Some recommendations.

by natalie5 - opened 27 days ago

Hello there, I really like your models and their open-source nature. I would like to give some recommendations for a future model.

Training tricks that save a whole lot of money:
https://arxiv.org/abs/2501.04765
https://arxiv.org/abs/2506.09229
https://arxiv.org/abs/2510.12581
https://arxiv.org/abs/2504.10483

Some RL/SFT stuff:
https://arxiv.org/abs/2505.07818
https://arxiv.org/abs/2511.15605

SOTA video models that have detailed training papers:
https://arxiv.org/abs/2601.04151 (26B audio-video model, apparently Veo3 level)
https://arxiv.org/abs/2508.15761 (12B video model that was #3 on arena at one point, very detailed paper)
https://arxiv.org/abs/2511.18870 (8B video model that surpasses Wan2.2, open source)
https://arxiv.org/abs/2511.14993 (20B open source video model, #1 open model on arena, bad prompt adherence though)

Better VAE:
https://huggingface.co/kandinskylab/KVAE-3D-1.0 (though a high-compression VAE, like the one in LTX 2, will be much faster)

Hopefully these papers help with the next model(s). Very excited to see what comes next from you guys :) (hopefully a 24B audio video model? 🙏)

schopra

Linum AI org 26 days ago

Appreciate the interest and the thoughtful links!

schopra changed discussion status to closed 26 days ago
