Fine-tuning Minitron-4B-Base?
by vaclavkosar
Recently, Nvidia published this 4B distilled model, with some evals matching the 8B:
https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base
If you guys fine-tune it into a Dolphin chat model, it could be one of the first fine-tunes at this performance level in the 4B size category.
There is also this model: https://huggingface.co/solidrust/Llama-3.1-Minitron-4B-Magpie-SFT-800K-MT-Magpo-3.1-Pro-05-AWQ
My only concern is whether there are architecture changes. I think I saw somewhere a conversion from the Nemo to the Llama architecture for compatibility.
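One quick way to check this (a sketch of my own, not something from the model card): a checkpoint's `config.json` lists the model classes it expects under `architectures`, so if it declares `LlamaForCausalLM`, standard Llama fine-tuning tooling should load it without any Nemo-to-Llama conversion.

```python
# Sketch: check whether a checkpoint's config declares the standard
# Llama architecture (i.e. no Nemo -> Llama conversion needed).

def uses_llama_arch(config) -> bool:
    # HF config objects (and raw config.json dicts) carry an
    # `architectures` list naming the expected model classes.
    if isinstance(config, dict):
        archs = config.get("architectures") or []
    else:
        archs = getattr(config, "architectures", None) or []
    return "LlamaForCausalLM" in archs

# With transformers installed (fetches config.json over the network):
# from transformers import AutoConfig
# cfg = AutoConfig.from_pretrained("nvidia/Llama-3.1-Minitron-4B-Width-Base")
# print(cfg.model_type, uses_llama_arch(cfg))
```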