@mlabonne on Hugging Face: "🌳 Model Family Tree Merging models has become a powerful way to compress…"

mlabonne

posted an update Jan 20

🌳 Model Family Tree

Merging models has become a powerful way to compress information and build strong models for cheap. Right now, the process is still quite experimental: which models should we merge? Which parameters should we use? We have some intuition but no principled approach.

I made a small tool to make things a little clearer. It lets you visualize the family tree of any model on the Hub. It also displays the type of license each model in the tree uses: permissive (green), noncommercial (red), and unknown (gray). It should help people select the right license for a merge based on its parent models.
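As a rough illustration of the color scheme described above, here is a minimal sketch. The license buckets and the `merged_color` rule (the most restrictive parent wins) are assumptions made for this example, not taken from the notebook.

```python
# Hypothetical license buckets: the post does not list which identifiers
# belong to each group, so these sets are illustrative only.
PERMISSIVE = {"apache-2.0", "mit", "bsd-3-clause"}
NONCOMMERCIAL = {"cc-by-nc-4.0", "cc-by-nc-sa-4.0", "cc-by-nc-nd-4.0"}


def license_color(license_id):
    """Map a license identifier to green / red / gray as described in the post."""
    if not license_id:
        return "gray"  # no license metadata: treat as unknown
    key = license_id.strip().lower()
    if key in PERMISSIVE:
        return "green"
    if key in NONCOMMERCIAL:
        return "red"
    return "gray"  # anything unrecognized is also treated as unknown


def merged_color(parent_colors):
    """Assumed rule: a merge inherits the most restrictive parent color."""
    colors = set(parent_colors)
    if "red" in colors:
        return "red"
    if "gray" in colors:
        return "gray"
    return "green"


parents = ["Apache-2.0", "cc-by-nc-4.0", None]
print([license_color(p) for p in parents])
print(merged_color(license_color(p) for p in parents))
```

Under these assumptions, one noncommercial parent is enough to mark the whole merge as red.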

In addition, I hope it can be refined to extract more information about these models: do models from very different branches work better when merged? Can we select them based on the weight difference? There are a lot of questions to explore in this new space. :)

Here's a link to the colab notebook I made: https://colab.research.google.com/drive/1s2eQlolcI1VGgDhqWIANfkfKvcKrMyNr
If you want to know more about model merging or build your own merges, here's the article I wrote about this topic: https://huggingface.co/blog/mlabonne/merge-models

MaziyarPanahi

Jan 20

Thanks for sharing the Colab notebook, it's incredibly useful and easy to use!

osanseviero

Jan 20

This is super cool! I tried it today with https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.1 and ended up with super cool results. I see quite a bit of the base_model metadata comes from mergekit configs. Once that's added to the model card metadata, it will be quite easy to explore all these relationships!
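To make the mergekit point concrete, here is a hedged sketch of pulling parent model names out of an already-parsed config. It assumes the config is a nested dict/list in which parents appear as strings under `model` or `base_model` keys. The example config and the `org/...` names are hypothetical.

```python
def referenced_models(config):
    """Collect every string stored under a 'model' or 'base_model' key."""
    found = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in ("model", "base_model") and isinstance(value, str):
                    if value not in found:  # keep first-seen order, no duplicates
                        found.append(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return found


# Hypothetical config, shaped like a parsed YAML merge recipe.
example_config = {
    "slices": [
        {
            "sources": [
                {"model": "org/model-a", "layer_range": [0, 32]},
                {"model": "org/model-b", "layer_range": [0, 32]},
            ]
        }
    ],
    "merge_method": "slerp",
    "base_model": "org/model-a",
}

print(referenced_models(example_config))
```

Walking the whole structure instead of hard-coding one layout means the helper does not depend on whether parents are listed under `slices` or at the top level.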

mrfakename

Jan 21

Here's a HF Space for easier usage:

https://huggingface.co/spaces/mrfakename/merge-model-tree

Epiculous

Jan 23

Fantastic work! Thank you for putting this together!

CultriX

Jan 23

Awesome tool!

leonardlin

Feb 18

BTW, I was trying to get a tree for https://huggingface.co/mlabonne/AlphaMonarch-7B and it got caught in a recursion loop. I started by adding caching on the ModelCard, assuming that would sort things out, but it didn't, so I hacked in some code to prevent revisits. I also added some weak handling for missing models, since those were looping as well (AIDC-ai-business/Marcoroni-7B-v3, for example, has disappeared).

Anyway, my updated code still has broken chart rendering (cyclic graph - what was causing the looping issues) but at least it will get a list of the model lineage, which was good enough for my purposes... In case anyone wants to move this forward or needs a reference in case they run into looping issues: https://colab.research.google.com/drive/1-7w_pPWPCCQQpQ7LrvlKIdhyHsoCHH4E?usp=sharing
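For reference, the revisit-prevention idea can be sketched as follows. This is not the code from either notebook: `get_parents` is a hypothetical callable standing in for whatever reads a model card's parents, and a plain dict plays the role of the Hub so the example runs offline.

```python
def collect_lineage(root, get_parents):
    """Return (lineage, missing), visiting each model at most once."""
    visited = set()
    lineage = []
    missing = []
    stack = [root]
    while stack:
        model_id = stack.pop()
        if model_id in visited:
            continue  # already handled: this is what breaks cycles
        visited.add(model_id)
        try:
            parents = get_parents(model_id)
        except KeyError:
            # Assumption: a deleted or unreachable model raises KeyError here.
            missing.append(model_id)
            continue
        lineage.append(model_id)
        stack.extend(parents)
    return lineage, missing


# Toy stand-in for the Hub: "merge" and "a" reference each other (a cycle),
# and "gone" has no entry, like a model that has disappeared.
toy_hub = {
    "merge": ["a", "b"],
    "a": ["merge"],
    "b": ["gone"],
}

lineage, missing = collect_lineage("merge", toy_hub.__getitem__)
print(sorted(lineage), missing)
```

Using an explicit stack rather than recursion also avoids hitting Python's recursion limit on deep trees.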

mlabonne

Feb 18

Thanks a lot @leonardlin ! I fixed the issue you raised and updated the original notebook with your changes: https://colab.research.google.com/drive/1s2eQlolcI1VGgDhqWIANfkfKvcKrMyNr
