Details

I am still rigorously testing different hyperparameters and comparing the impact of each one to find the best workflow. So far I have completed 16 different full trainings and am running 8 more at the moment. I am using my poor, overfit 15-image dataset for experimentation (4th image). I have already proven that when I use a better dataset the results become many times better and expressions are generated perfectly. Here is an example case: https://www.reddit.com/r/FluxAI/comments/1ffz9uc/tried_expressions_with_flux_lora_training_with_my/

Conclusions

- When the results are analyzed, Fine Tuning is far less overfit, more generalized, and of better quality than LoRA training.
- In the first 2 images, it is able to change hair color and add a beard much better, which means less overfitting.
- In the third image, you will notice that the armor is much better, again indicating less overfitting.
- I noticed that the environment and clothing are much less overfit and of better quality.

Disadvantages

- Kohya still doesn't have FP8 training, so 24 GB GPUs get a huge speed drop.
- Moreover, 48 GB GPUs have to use the Fused Back Pass optimization, so they also take some speed drop.
- 16 GB GPUs get a far more aggressive speed drop due to the lack of FP8.
- Clip-L and T5 training are still not supported.

Speeds

- Rank 1 Fast Config: uses 27.5 GB VRAM, 6.28 seconds/it (LoRA is 4.85 seconds/it)
- Rank 1 Slower Config: uses 23.1 GB VRAM, 14.12 seconds/it (LoRA is 4.85 seconds/it)
- Rank 1 Slowest Config: uses 15.5 GB VRAM, 39 seconds/it (LoRA is 6.05 seconds/it)

(A rough wall-clock estimate based on these numbers is sketched below.)

Final Info

- Saved checkpoints are FP16 and thus 23.8 GB (no Clip-L or T5 trained).
- According to Kohya, the applied optimizations don't change quality, so all configs are ranked as Rank 1 at the moment.
- I am still testing whether these optimizations have any impact on quality.
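To put the Speeds section above into perspective, here is a rough back-of-the-envelope time estimate. The 3,000-step count is just an assumed example value, not one of my actual configs; only the seconds/it figures come from the measurements above.

```python
# Back-of-the-envelope training time from the seconds/it figures above.
# NOTE: total_steps is an assumed example value, not one of my actual configs.
total_steps = 3000

configs = {
    "Fine Tuning - Fast (27.5 GB)": 6.28,
    "Fine Tuning - Slower (23.1 GB)": 14.12,
    "Fine Tuning - Slowest (15.5 GB)": 39.0,
    "LoRA (for comparison)": 4.85,
}

for name, sec_per_it in configs.items():
    hours = total_steps * sec_per_it / 3600
    print(f"{name}: ~{hours:.1f} hours for {total_steps} steps")
```

The point of the comparison is simply that the per-iteration differences compound into hours of extra wall-clock time over a full training run.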
I am delighted to announce the publication of LegalKit, a labeled French dataset built for legal ML training 🤗
This dataset comprises more than 50,000 query-document pairs curated for training sentence embedding models in the domain of French law.
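To illustrate how such query-document pairs are typically used, here is a minimal contrastive-training sketch with sentence-transformers. The base model name and the in-memory pairs below are placeholders for illustration, not part of LegalKit itself.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Placeholder pairs; in practice these would come from the LegalKit query-document pairs.
pairs = [
    ("Quelles sont les conditions de validité d'un contrat ?", "Article 1128 du Code civil : ..."),
    ("Quel est le délai de prescription de droit commun en matière civile ?", "Article 2224 du Code civil : ..."),
]

# Any base encoder works here; this multilingual model is just an example choice.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

train_examples = [InputExample(texts=[query, document]) for query, document in pairs]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# In-batch negatives: each query is pulled toward its own document and pushed away from the others.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```

MultipleNegativesRankingLoss treats the other documents in a batch as negatives, which is a common default for query-document pair datasets of this kind.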
The labeling process follows a systematic approach to ensure consistency and relevance:

- Initial Query Generation: Three instances of the LLaMA-3-70B model independently generate three different queries based on the same document.
- Selection of Optimal Query: A fourth instance of the LLaMA-3-70B model, using a dedicated selection prompt, evaluates the generated queries and selects the most suitable one.
- Final Label Assignment: The chosen query is used to label the document, aiming to ensure that the label accurately reflects the content and context of the original text.
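For illustration, here is a minimal sketch of that three-step pipeline. It assumes an OpenAI-compatible chat endpoint serving LLaMA-3-70B; the client setup, prompts, and helper names are illustrative assumptions, not taken from the actual LegalKit labeling code.

```python
from openai import OpenAI

# Assumption: an OpenAI-compatible endpoint serving LLaMA-3-70B (e.g. a local vLLM server).
# The base_url, model name, and prompts below are illustrative placeholders.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
MODEL = "meta-llama/Meta-Llama-3-70B-Instruct"


def chat(prompt: str, temperature: float = 0.8) -> str:
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return response.choices[0].message.content.strip()


def label_document(document: str) -> str:
    # Step 1 - Initial Query Generation: three independent generations for the same document.
    queries = [
        chat(f"Write one question, in French, that this legal text answers:\n\n{document}")
        for _ in range(3)
    ]

    # Step 2 - Selection of Optimal Query: a fourth call picks the most suitable candidate.
    numbered = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(queries))
    choice = chat(
        "Here are three candidate queries for the same legal text:\n"
        f"{numbered}\n\nAnswer only with the number of the most suitable query.",
        temperature=0.0,
    )
    index = int("".join(c for c in choice if c.isdigit())[:1] or "1") - 1

    # Step 3 - Final Label Assignment: the chosen query becomes the document's label.
    return queries[min(max(index, 0), 2)]
```

Using a separate instance with a dedicated selection prompt, rather than letting a generator rank its own outputs, is what keeps the selection step independent of the generation step.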