Model
This model is based on nicoladecao/msmarco-word2vec256000-distilbert-base-uncased, which uses a 256k-token vocabulary initialized with word2vec.
It was trained with masked language modeling (MLM) on the MS MARCO corpus for 445k steps on 2x V100 GPUs. See train_mlm.py for the training script.
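As a rough illustration of this kind of MLM training (this is a minimal sketch, not the actual train_mlm.py; the dataset file, batch size, sequence length, and output path are assumptions):

```python
# Minimal MLM training sketch with Hugging Face Transformers.
# Illustrative only: data file, batch size, and output dir are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "nicoladecao/msmarco-word2vec256000-distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# MS MARCO passages as plain text, one passage per line (placeholder file name)
dataset = load_dataset("text", data_files={"train": "msmarco_passages.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask tokens for the MLM objective
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mlm-msmarco",
    per_device_train_batch_size=64,
    max_steps=445_000,  # step count from the description above
    fp16=True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```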
Note: Token embeddings were updated during training!
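The trained checkpoint can be used for masked-token prediction, for example (the model path below is a placeholder for wherever this checkpoint is stored or published):

```python
# Load the trained checkpoint for masked-token prediction.
# "path/to/this-model" is a placeholder, not the actual repository ID.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="path/to/this-model")
print(fill_mask("The capital of France is [MASK]."))
```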