k-ush/xlm-roberta-base-ance-en-jp-warmup

An XLM-RoBERTa-base model trained on the mMARCO Japanese dataset with the ANCE warmup script. The base checkpoint comes from k-ush/xlm-roberta-base-ance-warmup, so this model was trained on both English and Japanese data. I uploaded the checkpoint at 50k steps because MRR@100 had decreased at the 60k checkpoint (MRR@100 (rerank, full): 0.242, 0.182).

Dataset

I formatted the Japanese mMARCO dataset for ANCE. The dataset preparation script is available on GitHub: https://github.com/argonism/JANCE/blob/master/data/gen_jp_data.py

Evaluation Result

Evaluation results during training on the mMARCO Japanese dev set.

Reranking / full ranking MRR: 0.24208174148360342 / 0.19015224905626082
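
For readers unfamiliar with the metric, here is a minimal sketch of how MRR@k is commonly computed. It is not the ANCE evaluation code; the function name `mrr_at_k` and the toy query and passage IDs are hypothetical and chosen only for illustration. The usual convention, assumed here, is that each query contributes the reciprocal of the rank of its first relevant passage within the top k, or 0 if none appears.

```python
from typing import Dict, List, Set


def mrr_at_k(
    rankings: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 100,
) -> float:
    """Mean reciprocal rank over all queries in `qrels`, cut off at rank k.

    rankings: query id -> passage ids ordered from best to worst score
    qrels:    query id -> set of relevant passage ids
    """
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        # Only the top-k passages count; missing queries score 0.
        for rank, pid in enumerate(rankings.get(qid, [])[:k], start=1):
            if pid in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit matters
    return total / len(qrels)


# Toy data (hypothetical ids): hits at rank 2, rank 1, and no hit.
toy_rankings = {
    "q1": ["p3", "p1", "p7"],
    "q2": ["p2", "p9"],
    "q3": ["p4", "p5"],
}
toy_qrels = {"q1": {"p1"}, "q2": {"p2"}, "q3": {"p8"}}

print(mrr_at_k(toy_rankings, toy_qrels, k=100))  # (1/2 + 1 + 0) / 3 = 0.5
```

Under this sketch, the same function would serve both settings reported above; only the source of `rankings` would differ (a reordered candidate list for reranking, retrieval over the whole collection for full ranking).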