Arconte-13B

Arconte is a Llama-2 merge. It has gone through many iterations, trying different recipes, models, and merge methods; iterations I and Z in particular showed promise. This version of Arconte is iteration I redone with a more experimental approach to the merge recipe, and it shows great results.

Originally, the idea was to do one of those fancy three-step merges: DARE TIES for model A, DARE TIES for model B, then SLERP model A + model B into model C. I already did the SLERP model C, but it's flawed because iterations I and Z were flawed. I still plan to do model C, so now I am remaking iteration Z.
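For readers unfamiliar with the terms, here is a toy sketch of the two building blocks on flat lists of weights. It is not the actual recipe or tooling used for Arconte: the function names `dare_delta` and `slerp` are made up for illustration, the sign-election step of TIES is left out entirely, and a real merge works tensor by tensor over full checkpoints.

```python
import math
import random


def dare_delta(base, tuned, drop_p, rng):
    """DARE-style delta (illustrative): randomly drop a fraction drop_p of the
    fine-tune delta (tuned - base) and rescale the survivors by 1 / (1 - drop_p)."""
    scale = 1.0 / (1.0 - drop_p)
    return [
        (t - b) * scale if rng.random() >= drop_p else 0.0
        for b, t in zip(base, tuned)
    ]


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.
    Falls back to plain linear interpolation when the vectors are
    (nearly) parallel or one of them is (nearly) zero."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a < eps or norm_b < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    w_a = math.sin((1 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(a, b)]


# Toy usage: sparsify one fine-tune's delta, add it back onto the base,
# then blend the result halfway with a second set of weights.
rng = random.Random(0)
base = [0.0, 0.0, 0.0, 0.0]
tuned = [0.2, -0.1, 0.4, 0.3]
model_a = [b + d for b, d in zip(base, dare_delta(base, tuned, 0.5, rng))]
model_b = [0.1, 0.3, -0.2, 0.5]
model_c = slerp(model_a, model_b, 0.5)
```

The point of the SLERP step is that it interpolates along the angle between the two weight vectors rather than along a straight line, which is why it is usually applied to exactly two models at a time.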

Models used:

NeverSleep/X-NoroChronos-13B

Undi95/Emerhyst-13B

Henk717/echidna-tiefigther-25

After completing model C, the current roadmap is to either move on to Mistral merges or try my hand at making LoRAs/QLoRAs. No Mixtral, nor anything above 13B parameters, in the future due to hardware limitations.

All testing was done with a Q5_K_M GGUF. I'll upload the full GGUF range along with an imatrix version soon.

Update 3/30/24

I have tested this model further and concluded that I find it boring. I remember greenlighting it because it was coherent (as much as a Q5_K_M can be), but now I think it's just not that good. Perhaps it is just my taste in models, or maybe my sampling settings are bad? I would like some feedback on how good or bad this model is. I still plan to cook that model C, but I don't know if I will use this one to do it.

I will be releasing another model soon, an older model that I think is better than this one.

Model size: 13B params (Safetensors). Tensor type: FP16.