A small version of DeBERTa trained on the clean version of the Google C4 dataset. For more information about the model's size, see config.json.
The model has been trained for 100K steps with a batch size of 2048 and a sequence length of 512, for a total of 104B tokens.
The vocabulary and the tokenizer are the same as microsoft/deberta-base.
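
Below is a minimal usage sketch with the Hugging Face transformers library. The repository id is a placeholder, since the actual Hub id of this checkpoint is not stated here; substitute the real one when loading.

```python
from transformers import AutoTokenizer, AutoModel

# Placeholder repo id: replace with the actual Hub id of this checkpoint.
repo_id = "<org>/deberta-small-c4"

# The tokenizer is the same as microsoft/deberta-base, so loading it
# from either repository should yield an identical vocabulary.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```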