grimjim 
posted an update May 28
I propose "merge densification", a style of merge that attempts to transfer the benefits of a denser model to a base model. The model weight in this merge is 0.02, which is atypically small for a merge but large compared to the learning rates used during training. The expected result here is more creative text generation. More details below:
grimjim/kunoichi-lemon-royale-v3-32K-7B
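To give a feel for how small a 0.02 weight is, here is a minimal sketch of a low-weight merge. The post does not say which merge method was used, so plain linear interpolation is an assumption here, and `merge_linear`, the parameter name `layer.weight`, and the toy values are all hypothetical, chosen only for illustration.

```python
def merge_linear(base, donor, weight=0.02):
    """Blend donor parameters into base parameters.

    Assumption: a plain linear merge, (1 - weight) * base + weight * donor.
    The original post does not specify the actual merge method.
    Models are represented as dicts mapping parameter names to flat lists.
    """
    if base.keys() != donor.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name, base_vals in base.items():
        donor_vals = donor[name]
        if len(base_vals) != len(donor_vals):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Each merged value moves only `weight` of the way toward the donor.
        merged[name] = [
            (1.0 - weight) * b + weight * d
            for b, d in zip(base_vals, donor_vals)
        ]
    return merged


# Toy stand-ins for real checkpoints (hypothetical values).
base = {"layer.weight": [1.0, -2.0, 0.5]}
donor = {"layer.weight": [2.0, 0.0, 0.5]}
merged = merge_linear(base, donor, weight=0.02)
```

With a weight of 0.02, each parameter shifts by only 2% of its distance to the donor's value, so the base model's behavior should dominate. A real merge would operate on full checkpoint tensors rather than Python lists.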