alvdansen (araminta

posted an update 4 months ago

Releasing Flimmer today — a video LoRA training toolkit for WAN 2.1 and 2.2 that covers the full pipeline from raw footage to trained checkpoint.
The standout feature is phased training: multi-stage runs where each phase has its own learning rate, epochs, and dataset, with the checkpoint carrying forward automatically. Built specifically with WAN 2.2's dual-expert MoE architecture in mind.
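
As a rough illustration of the phased-training idea (not Flimmer's actual API or config format), the sketch below runs phases in order and hands each phase's checkpoint to the next. The `Phase` fields, the dataset paths, and `fake_train_phase` are all hypothetical stand-ins; a real trainer call would replace the stub.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Phase:
    """One training phase with its own hyperparameters (hypothetical schema)."""
    name: str
    learning_rate: float
    epochs: int
    dataset: str  # hypothetical dataset path


def run_phases(
    phases: List[Phase],
    train_phase: Callable[[Phase, Optional[str]], str],
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Run phases in order, passing each phase's checkpoint to the next one."""
    checkpoint: Optional[str] = None
    history: List[Tuple[str, str]] = []
    for phase in phases:
        # The previous checkpoint (None for the first phase) carries forward.
        checkpoint = train_phase(phase, checkpoint)
        history.append((phase.name, checkpoint))
    return checkpoint, history


def fake_train_phase(phase: Phase, resume_from: Optional[str]) -> str:
    """Stand-in for a real training call: records checkpoint lineage as a string."""
    start = resume_from or "base"
    return f"{start}->{phase.name}(lr={phase.learning_rate},ep={phase.epochs})"


phases = [
    Phase("warmup", 1e-4, 5, "data/broad"),
    Phase("refine", 5e-5, 10, "data/curated"),
]
final_checkpoint, history = run_phases(phases, fake_train_phase)
print(final_checkpoint)
```

Keeping the training call as a plain callable means the phase loop stays the same whichever trainer sits underneath.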

Data prep tools are standalone and output standard formats — they work with any trainer, not just Flimmer.

Early release, building in the open. LTX support coming next.

http://github.com/alvdansen/flimmer-trainer

replied to their post 4 months ago

Thank you!!

posted an update 4 months ago

Just open-sourced LoRA Gym with Timothy: a production-ready training pipeline for character, motion, aesthetic, and style LoRAs on Wan 2.1/2.2, built on musubi-tuner.

16 training templates across Modal (serverless) and RunPod (bare metal) covering T2V, I2V, Lightning-merged, and vanilla variants.

Our current experimentation focus is Wan 2.2, which is why we built on musubi-tuner (kohya-ss). Wan 2.2's DiT uses a Mixture-of-Experts architecture with two separate experts gated by a hard timestep switch - you're training two LoRAs per concept, one for high-noise (composition/motion) and one for low-noise (texture/identity), and loading both at inference. Musubi handles this dual-expert training natively, and our templates build on top of it to manage the correct timestep boundaries, precision settings, and flow shift values so you don't have to debug those yourself. We've also documented bug fixes for undocumented issues in musubi-tuner and validated hyperparameter defaults derived from cross-referencing multiple practitioners' results rather than untested community defaults.
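
To make the hard timestep switch concrete, here is a minimal sketch of how a denoising schedule splits between the two experts. It is not musubi-tuner or LoRA Gym code. It assumes timesteps normalized to the range 0 to 1 with 1.0 as pure noise, and the `BOUNDARY` value is a placeholder rather than a recommended setting.

```python
from typing import List, Tuple

# Placeholder boundary for illustration only; the real value depends on the
# model variant and should come from the trainer's own defaults.
BOUNDARY = 0.5


def select_expert(timestep: float, boundary: float = BOUNDARY) -> str:
    """Hard gate: noisy timesteps go to one expert, cleaner ones to the other."""
    return "high_noise" if timestep >= boundary else "low_noise"


def plan_denoising(num_steps: int, boundary: float = BOUNDARY) -> List[Tuple[float, str]]:
    """List which expert (and therefore which LoRA) handles each sampling step."""
    timesteps = [1.0 - i / num_steps for i in range(num_steps)]
    return [(t, select_expert(t, boundary)) for t in timesteps]


for t, expert in plan_denoising(4):
    print(f"t={t:.2f} -> {expert}")
```

Because the gate is a hard threshold, each LoRA only ever sees its own side of the boundary, which is why the two are trained separately and both loaded at inference.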

Also releasing our auto-captioning toolkit for the first time. Per-LoRA-type captioning strategies for characters, styles, motion, and objects. Gemini (free) or Replicate backends.

Current hyperparameters reflect consolidated community findings. We've started our own refinement and plan to release specific recommendations and methodology as soon as next week.

Repo: github.com/alvdansen/lora-gym

2 replies

posted an update almost 2 years ago

📸Photo LoRA Drop📸

I've been working on this one for a few days, but really I've had this dataset for a few years! I collected a bunch of open access photos online back in late 2022, but I was never happy enough with how they played with the base model!

I am so thrilled that they look so nice with Flux!

For me, this is version one of this model. I still see room for improvement and possibly expansion of its 40-image dataset. For those who are curious:

40 Images
3200 Steps
Dim 32
3e-4
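
For a sense of scale, the numbers above can be turned into passes over the dataset. This is a back-of-the-envelope sketch that assumes batch size 1 and no dataset repeats, neither of which the post states; the function name is just for illustration.

```python
def passes_over_dataset(total_steps: int, num_images: int,
                        batch_size: int = 1, repeats: int = 1) -> float:
    """Number of epochs implied by a step count (assumed batch size and repeats)."""
    steps_per_epoch = (num_images * repeats) / batch_size
    return total_steps / steps_per_epoch


# 3200 steps over 40 images, under the assumptions above
print(passes_over_dataset(3200, 40))
```

Under those assumptions each image is seen 80 times; a larger batch size or dataset repeats would change that figure.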

Enjoy! Create! Big thank you to Glif for sponsoring the model creation! :D

alvdansen/flux_film_foto

posted an update almost 2 years ago

Alright Y'all

I know it's a Saturday, but I decided to release my first Flux Dev LoRA.

It's a retrain of my "Frosting Lane" model, and I am sure the styles will just keep improving.

Have fun! Link Below - Thanks again to @ostris for the trainer and Black Forest Labs for the awesome model!

alvdansen/frosting_lane_flux

posted an update almost 2 years ago

New model drop...🥁

FROSTING LANE REDUX

The v1 of this model was released during a big model push, so I think it got lost in the shuffle. I revisited it for a project and realized it wasn't inventive enough around certain concepts, so I decided to retrain.

alvdansen/frosting_lane_redux

I think the original model was really strong on its own, but because it was trained on fewer images I found that it was producing a very lackluster range of facial expressions, so I wanted to improve that.

The hardest part of creating models like this, I find, is maintaining the detailed linework without overfitting. It takes a really balanced dataset, and I repeat the data 12 times during the process, stopping at the last 10-20 epochs.

It is very difficult to predict the exact amount of time needed, so for me it is crucial to do epoch stops. Every model has a different threshold for ideal success.
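
One way to picture epoch stops is saving a checkpoint at the end of each late epoch so the candidates can be compared afterwards. The sketch below only does the bookkeeping. The 50-image dataset and 30-epoch run are made-up numbers, and reading "the last 10-20 epochs" as a window of saved checkpoints is an assumption, not a description of the actual workflow.

```python
from typing import List


def steps_per_epoch(num_images: int, repeats: int, batch_size: int = 1) -> int:
    """Optimizer steps in one epoch when every image is repeated `repeats` times."""
    return -(-(num_images * repeats) // batch_size)  # ceiling division


def epochs_to_checkpoint(total_epochs: int, keep_last: int) -> List[int]:
    """Epoch numbers (1-based) whose checkpoints are kept for side-by-side testing."""
    first = max(1, total_epochs - keep_last + 1)
    return list(range(first, total_epochs + 1))


# Hypothetical run: 50 images, 12 repeats, 30 epochs, keep the last 10 checkpoints
print(steps_per_epoch(50, 12))
print(epochs_to_checkpoint(30, 10))
```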

replied to victor's post almost 2 years ago

Yeah that would be really helpful, I haven't had the time to try and do something like that.

replied to their post almost 2 years ago

Thanks!

posted an update almost 2 years ago

I really like what the @jasperAITeam designed with Flash LoRA. It works really well for something that generates so quickly, and I'm excited to test it out with AnimateDiff, because I was recently testing LCM on its own for AD and the results were already promising.

I put together my own page of models using their code and LoRA. Enjoy!

alvdansen/flash-lora-araminta-k-styles

replied to their post almost 2 years ago

No problem! Hope it helps!

posted an update almost 2 years ago

**How I train a LoRA: m3lt style training overview**

I've just written an article that takes a step-by-step approach to outlining the method I used to train the 'm3lt' LoRA, a blended style model.

I've used the LoRA Ease trainer by @multimodalart :D

https://huggingface.co/blog/alvdansen/training-lora-m3lt
multimodalart/lora-ease

5 replies

replied to their post almost 2 years ago

I responded on X with the best way to contact me.

reacted to their post with 🔥 almost 2 years ago

New LoRA Model!

I trained this model on a new spot I'm really excited to share (soon!)

This Monday I will be posting my first beginning-to-end blog post showing the tool I've used, the dataset, captioning techniques, and parameters to fine-tune this LoRA.

For now, check out the model in the link below.

alvdansen/m3lt

5 replies

replied to their post almost 2 years ago

I don’t know who you are

reacted to MonsterMMORPG's post with 🔥 almost 2 years ago

How to Use SwarmUI & Stable Diffusion 3 on Cloud Services Kaggle (free), Massed Compute & RunPod : https://youtu.be/XFUZof6Skkw

Tutorial link : https://youtu.be/XFUZof6Skkw

It has manually written captions / subtitles and also video chapters.

If you are GPU poor, this is the video you need.

In this video, I demonstrate how to install and use #SwarmUI on cloud services. If you lack a powerful GPU or wish to harness more GPU power, this video is essential. You'll learn how to install and utilize SwarmUI, one of the most powerful Generative AI interfaces, on Massed Compute, RunPod, and Kaggle (which offers free dual T4 GPU access for 30 hours weekly). This tutorial will enable you to use SwarmUI on cloud GPU providers as easily and efficiently as on your local PC. Moreover, I will show how to use Stable Diffusion 3 (#SD3) on cloud. SwarmUI uses #ComfyUI backend.

🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985

🔗 Windows Tutorial: Learn How to Use SwarmUI ➡️ https://youtu.be/HKX8_F1Er_w

🔗 How to download models very fast to Massed Compute, RunPod and Kaggle and how to upload models or files to Hugging Face very fast tutorial ➡️ https://youtu.be/X5WVZ0NMaTg

🔗 SECourses Discord ➡️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion GitHub Repo (Please Star, Fork and Watch) ➡️ https://github.com/FurkanGozukara/Stable-Diffusion

Coupon Code for Massed Compute : SECourses
Coupon works on Alt Config RTX A6000 and also RTX A6000 GPUs

replied to louisbrulenaudet's post almost 2 years ago

Congrats :D

posted an update almost 2 years ago

By popular request, I'm working on a beginning-to-end LoRA training workflow blog post for a style.

It will cover everything from dataset curation through training on a pre-determined style, to give better insight into my process.

What questions do you have that I can try to answer in it?

posted an update almost 2 years ago

A few new styles added as SDXL LoRA:

Midsommar Cartoon
A playful cartoon style featuring bold colors and a retro aesthetic. Personal favorite at the moment.
alvdansen/midsommarcartoon
---
Wood Block XL
I've started training public domain styles to create some interesting datasets. In this case I found a group of images taken from really beautiful and colorful Japanese block prints.
alvdansen/wood-block-xl
---
Dimension W
For this model I did actually end up working on an SD 1.5 model as well as an SDXL. I prefer the SDXL version, and I am still looking for parameters I am really happy with for SD 1.5. That said, both have their merits. I trained this with the short film I am working on in mind.
alvdansen/dimension-w
alvdansen/dimension-w-sd15

replied to their post almost 2 years ago

I typically use Kohya, but I also test a lot of platform services for the right one because I am a creature of comfort :)

araminta_k PRO

AI & ML interests

Recent Activity

Organizations

alvdansen's activity