Task vector optimization checkpoint, ready for merging.

Trained on MFANN for 12,000 steps. Because of a slightly higher training loss, I'm going to merge this model with the last version and retrain. The goal was to use DARE-TIES to reduce the number of parameters used per task vector. This model will now be merged with the last pre-DARE model using TIES alone, and then retrained.
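For readers unfamiliar with the two merge methods named above, here is a toy sketch on plain lists of floats. It is not the pipeline used for this checkpoint. The function names (`task_vector`, `dare`, `ties_merge`, `apply_delta`) and the `drop_rate` and `density` parameters are hypothetical and chosen for illustration only. DARE randomly drops entries of a task vector and rescales the survivors. TIES trims each task vector by magnitude, elects a sign per parameter, and averages only the entries that agree with that sign.

```python
import random


def task_vector(finetuned, base):
    """Task vector: fine-tuned weights minus base weights, element-wise."""
    return [f - b for f, b in zip(finetuned, base)]


def dare(delta, drop_rate, rng):
    """DARE-style sparsification: drop each entry with probability
    drop_rate and rescale the kept entries by 1 / (1 - drop_rate)."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    scale = 1.0 / (1.0 - drop_rate)
    return [d * scale if rng.random() >= drop_rate else 0.0 for d in delta]


def ties_merge(deltas, density):
    """TIES-style merge of several task vectors of equal length."""
    trimmed = []
    for delta in deltas:
        # Trim: keep roughly the top `density` fraction by magnitude
        # (entries tied with the threshold are all kept).
        k = max(1, int(round(density * len(delta))))
        threshold = sorted((abs(d) for d in delta), reverse=True)[k - 1]
        trimmed.append([d if abs(d) >= threshold else 0.0 for d in delta])

    merged = []
    for values in zip(*trimmed):
        # Elect sign: the sign of the summed values at this position.
        positive = sum(values) >= 0
        # Disjoint mean: average only non-zero entries with the elected sign.
        agreeing = [v for v in values if v != 0.0 and (v > 0) == positive]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


def apply_delta(base, delta, weight=1.0):
    """Add a (merged) task vector back onto the base weights."""
    return [b + weight * d for b, d in zip(base, delta)]


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [0.5, -0.1, 0.0, 0.3]   # made-up weights for illustration
    model_b = [0.4, 0.2, -0.6, -0.3]  # made-up weights for illustration
    deltas = [task_vector(model_a, base), task_vector(model_b, base)]

    # TIES alone.
    print(apply_delta(base, ties_merge(deltas, density=0.5)))

    # DARE-TIES: sparsify each task vector with DARE, then merge with TIES.
    rng = random.Random(0)
    sparse = [dare(d, drop_rate=0.5, rng=rng) for d in deltas]
    print(apply_delta(base, ties_merge(sparse, density=1.0)))
```

In this sketch the only difference between the two methods is the random drop-and-rescale step applied before the TIES merge, which is the step the plan above skips when merging with TIES alone.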


Safetensors. Model size: 2.78B params. Tensor type: F32.


Model tree for netcat420/MFANN3bv0.16.11

Merges: 3 models


Dataset used to train netcat420/MFANN3bv0.16.11