TrinityMaid-13b
I would love to see a merge of Noromaid-13b-0.4-DPO + WhiteRabbit-Trinity-13b.
WhiteRabbit's Trinity is surprisingly smart. However, it's based on CodeLlama (a Llama-2 derivative), so I don't know whether Llama-2-13b and CodeLlama weights are compatible enough to merge.
https://huggingface.co/WhiteRabbitNeo/Trinity-13B
Bump for maybe later
Thank you! I'm having trouble getting them to merge myself. The flattened versions behave normally, but when I merge them together, I get completely incoherent symbol rambling from the merge. Any insights from you guys would be much appreciated.
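For reference, this is roughly how I'm sanity-checking the merged output: a minimal sketch, assuming the merge was written to a local folder (the path and prompt are placeholders).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged checkpoint; "./TrinityMaid-13b" is a hypothetical
# placeholder for wherever the merge output was saved.
path = "./TrinityMaid-13b"
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, device_map="auto", torch_dtype="auto"
)

# Greedy-decode a trivial prompt: a healthy merge continues the sentence,
# while a broken one degenerates into symbol soup almost immediately.
inputs = tok("The quick brown fox", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```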
So I looked it up and yep, like you said, Trinity is based on CodeLlama, and it uses a different rope theta value (CodeLlama is trained with rope_theta = 1,000,000 vs Llama-2's 10,000); that's why it doesn't work.
I will look later to see if there is a solution, but sadly I don't think there is one right now.
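If anyone wants to confirm the mismatch themselves, here is a minimal sketch comparing the two configs. The Noromaid repo id is my assumption; Trinity's comes from the link above.

```python
from transformers import AutoConfig

# Llama-2 checkpoints ship with rope_theta = 10000, while CodeLlama-derived
# checkpoints like Trinity use rope_theta = 1000000, so naively averaging
# their weights mixes two incompatible positional encodings.
candidates = [
    "NeverSleep/Noromaid-13b-v0.4-DPO",  # assumed repo id, Llama-2 based
    "WhiteRabbitNeo/Trinity-13B",        # CodeLlama based, linked above
]

for name in candidates:
    cfg = AutoConfig.from_pretrained(name)
    # rope_theta defaults to 10000.0 when absent from config.json;
    # printing vocab_size too, since tokenizer sizes can also diverge.
    print(f"{name}: rope_theta={getattr(cfg, 'rope_theta', 10000.0)}, "
          f"vocab_size={cfg.vocab_size}")
```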
Thanks for looking into it @Undi95
I know from experience that when models aren't compatible (and in my case, I didn't know and tried it anyway), it can lead to some weird results, to say the least.
Here is the full power of my 70B model that I tried to create by merging two Japanese LLMs:
It is definitely a good bloated entropy generator, I guess. Maybe if I were Google, you would see a story about how the AI is evil and developing its own language and how I needed to shut it down... how spooky, ooooooooo. But it's just brain damage beyond repair. I don't know if there is a pattern besides its love for the letter Q and the word "Question"; definitely an interesting case study, perhaps.