This is a RoBERTa-base model trained from scratch on Spanish text.

The training dataset is a subsample of mC4 totalling about 50 million examples. Sampling is biased towards average perplexity values using a Gaussian function, so documents with very large perplexity (poor quality) or very small perplexity (short, repetitive texts) are discarded more often.
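
As a rough illustration of the sampling idea, here is a minimal sketch in Python. It assumes each document already carries a perplexity score and that the mean and standard deviation of those scores are known. The function names, the `perplexity` field, and the exact Gaussian form are hypothetical and chosen for illustration; the weighting function actually used for this model may differ (see the main card).

```python
import math
import random


def gaussian_keep_probability(perplexity, mean, std):
    """Probability of keeping a document; equals 1.0 when perplexity == mean.

    Assumption: an unnormalised Gaussian centred on the mean perplexity.
    """
    z = (perplexity - mean) / std
    return math.exp(-0.5 * z * z)


def subsample(docs, mean, std, seed=0):
    """Keep each document with a probability given by the Gaussian weight.

    `docs` is an iterable of dicts with a hypothetical "perplexity" key.
    """
    rng = random.Random(seed)
    return [
        doc
        for doc in docs
        if rng.random() < gaussian_keep_probability(doc["perplexity"], mean, std)
    ]


if __name__ == "__main__":
    # Toy perplexity values, invented purely for this example.
    docs = [{"id": i, "perplexity": p} for i, p in enumerate([5, 80, 100, 120, 900])]
    kept = subsample(docs, mean=100, std=20)
    print([doc["id"] for doc in kept])
```

Documents whose perplexity is far from the mean in either direction get a keep probability close to zero, which is the behaviour described above: very high and very low perplexity texts are discarded more often.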

This model has been trained for 250,000 steps.

Please see our main card for more information.

This model is part of the Flax/JAX Community Week, organised by Hugging Face, with TPU usage sponsored by Google.

Team members

Eduardo González (edugp)
Javier de la Rosa (versae)
Manu Romero (mrm8488)
María Grandury (mariagrandury)
Pablo González de Prado (Pablogps)
Paulo Villegas (paulo)
