Some recommendations.
Hello there, I really like your models and their open-source nature. I would like to give some recommendations for a future model.
Training tricks that save a whole lot of money:
https://arxiv.org/abs/2501.04765
https://arxiv.org/abs/2506.09229
https://arxiv.org/abs/2510.12581
https://arxiv.org/abs/2504.10483
Some RL/SFT stuff:
https://arxiv.org/abs/2505.07818
https://arxiv.org/abs/2511.15605
SOTA video models that have detailed training papers:
https://arxiv.org/abs/2601.04151 (26B audio-video model, apparently Veo3 level)
https://arxiv.org/abs/2508.15761 (12B video model that was #3 on arena at one point, very detailed paper)
https://arxiv.org/abs/2511.18870 (8B video model that surpasses Wan2.2, open source)
https://arxiv.org/abs/2511.14993 (20B open source video model, #1 open model on arena, bad prompt adherence though)
Better VAE:
https://huggingface.co/kandinskylab/KVAE-3D-1.0 (though a high-compression VAE, as in LTX 2, would be much, much faster)
Hopefully these papers help with the next model(s). Very excited to see what comes next from you guys :) (hopefully a 24B audio-video model?)
Appreciate the interest and the thoughtful links!