task vector optimization checkpoint ready for merging.

trained on MFANN for 12000 steps, however due to a slightly higher training loss, im going to merge this model with the last version and retrain, the goal was to use DARE-TIES to reduce the parameters used per vector, and this model will now be merged with the last model before DARE using TIES alone, and will be subsequently retrained.

Downloads last month
7
Safetensors
Model size
2.78B params
Tensor type
F32
·
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.

Model tree for netcat420/MFANN3bv0.16.11

Merges
3 models

Dataset used to train netcat420/MFANN3bv0.16.11