Questions on Training and Architecture

#2 opened by crosant13

I’m exploring this model, particularly its training methods and architectural specifics,
and I have a few questions:

Is the 2nd stage missing from the description of the "Training and Fine-tuning process", or is it a typo?
How architecturally distinct is this model from BGE3, and are there practical differences in its embedding approach?
What evaluation metrics did you use during training, and are any benchmarks available for comparison?
Could you share more about the fine-tuning capabilities—especially regarding generating custom embeddings or using the model in domain-specific applications?
Also, could you share the training code, or give us an idea of how exactly you did it?
Thank you in advance for any insights!
