# Model

This model is based on [nicoladecao/msmarco-word2vec256000-distilbert-base-uncased](https://huggingface.co/nicoladecao/msmarco-word2vec256000-distilbert-base-uncased), which uses a 256k-token vocabulary whose embeddings were initialized with word2vec.

The model was then trained with MLM (masked language modeling) on the MS MARCO corpus collection for 445k steps, using 2x V100 GPUs. See train_mlm.py for the training script.

**Note: Token embeddings were updated during training!**
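A minimal usage sketch for loading the checkpoint and running masked-token prediction with `transformers`. The repo ID below is a placeholder, not the actual path of this model; replace it with the Hub ID under which this checkpoint is published.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

# Placeholder repo ID -- substitute this model's actual Hub path.
model_name = "path/to/this-model"

# The tokenizer carries the 256k word2vec-initialized vocabulary;
# the model head predicts masked tokens over that vocabulary.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask("MS MARCO is a large scale [MASK] retrieval dataset."))
```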