Model: DeBERTa
Lang: IT

Model description

This is a DeBERTa [1] model for the Italian language. It was obtained by taking mDeBERTa (mdeberta-v3-base) as a starting point and specializing it for Italian by modifying the embedding layer (as in [2], with document-level frequencies computed over the Wikipedia dataset).
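The embedding-layer modification mentioned above can be illustrated with a small, dependency-free sketch. It is not the procedure actually used to build this model: it only shows the general idea of counting in how many documents each token appears, keeping the most frequent tokens, and slicing the embedding matrix accordingly. The function names, the toy token ids, and the `max_size` cut-off are all hypothetical.

```python
from collections import Counter


def document_frequencies(tokenized_docs):
    """Count, for each token id, how many documents contain it at least once."""
    df = Counter()
    for doc in tokenized_docs:
        df.update(set(doc))  # set() so repeats inside one document count once
    return df


def select_vocabulary(df, special_ids, max_size):
    """Keep the special tokens plus the most document-frequent tokens."""
    kept = list(special_ids)
    special = set(special_ids)
    # Rank by descending document frequency, breaking ties by token id.
    ranked = sorted((t for t in df if t not in special), key=lambda t: (-df[t], t))
    kept.extend(ranked[: max(0, max_size - len(kept))])
    return kept


def reduce_embeddings(embedding_rows, kept_ids):
    """Return the smaller embedding matrix and the old-id -> new-id mapping."""
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    reduced = [embedding_rows[old] for old in kept_ids]
    return reduced, old_to_new


# Toy data (hypothetical): three "documents" of token ids and a 10 x 2 embedding matrix.
docs = [[5, 7, 7, 9], [5, 9], [5, 8]]
embeddings = [[float(i), float(i) + 0.5] for i in range(10)]

df = document_frequencies(docs)
kept_ids = select_vocabulary(df, special_ids=[0, 1], max_size=4)
reduced, old_to_new = reduce_embeddings(embeddings, kept_ids)

print(kept_ids)
print(old_to_new)
```

In a real setting the tokenizer vocabulary would also have to be rebuilt so that it emits the new ids, which this sketch leaves out.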

The resulting model has 124M parameters, a vocabulary of 50,256 tokens, and a size of ~500 MB.
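As a rough sanity check, the figures above can be related to each other with a little arithmetic. The hidden size of 768 and the backbone count of about 86M parameters are assumptions taken from the published description of mdeberta-v3-base, not from this card, and 4 bytes per parameter assumes float32 weights.

```python
VOCAB_SIZE = 50_256            # from the model description
HIDDEN_SIZE = 768              # assumption: hidden size of the base architecture
BACKBONE_PARAMS = 86_000_000   # assumption: approximate non-embedding parameters
BYTES_PER_PARAM = 4            # assumption: weights stored as float32

# One embedding row of HIDDEN_SIZE values per vocabulary entry.
embedding_params = VOCAB_SIZE * HIDDEN_SIZE
total_params = embedding_params + BACKBONE_PARAMS
size_mb = total_params * BYTES_PER_PARAM / 1e6

print(f"embedding parameters: {embedding_params:,}")
print(f"total parameters:     {total_params:,}")
print(f"approximate size:     {size_mb:.0f} MB")
```

Under these assumptions the embedding layer accounts for roughly 38.6M parameters, which is consistent with the stated total of 124M and a size of about 500 MB.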

Quick usage

from transformers import DebertaV2TokenizerFast, DebertaV2Model

# Load the tokenizer and the model weights from the Hugging Face Hub
tokenizer = DebertaV2TokenizerFast.from_pretrained("osiria/deberta-base-italian")
model = DebertaV2Model.from_pretrained("osiria/deberta-base-italian")

References

[1] https://arxiv.org/abs/2111.09543

[2] https://arxiv.org/abs/2010.05609

License

The model is released under the MIT license.
