ALBERT Tiny Spanish

This is an ALBERT model trained on a large Spanish corpus. The model was trained on a single TPU v3-8 with the following hyperparameters, step counts, and training time:
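A minimal loading sketch may help here. It assumes the Hugging Face `transformers` library (with `torch` and `sentencepiece`) is installed and that the checkpoint is published under the repository name `dccuchile/albert-tiny-spanish`, as this page's title and repository reference suggest. The helper names `load_model` and `embed` are hypothetical and are not part of the model's release.

```python
# Assumption: the checkpoint lives under this Hugging Face repository name.
MODEL_ID = "dccuchile/albert-tiny-spanish"


def load_model(model_id: str = MODEL_ID):
    """Download (or read from cache) the tokenizer and the bare ALBERT encoder."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed(text: str, tokenizer, model):
    """Return the encoder's last hidden state for a single input string."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # no gradients are needed for feature extraction
        outputs = model(**inputs)
    return outputs.last_hidden_state
```

Calling `tokenizer, model = load_model()` followed by `embed("Hola mundo", tokenizer, model)` should return a tensor of shape `(1, sequence_length, hidden_size)`. The bare encoder is used because the page does not state which downstream task head, if any, is intended.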

  • LR: 0.00125
  • Batch Size: 2048
  • Warmup ratio: 0.0125
  • Warmup steps: 125000
  • Goal steps: 10000000
  • Total steps: 8300000
  • Total training time (approx.): 58.2 days
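The figures above are related by simple arithmetic, sketched below. The warmup step count equals the warmup ratio times the goal steps, and the run stopped at 83% of its goal. The sequence count and throughput are rough estimates that assume the batch size counts sequences per step and that every step processed a full batch; the card does not state either.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.00125
BATCH_SIZE = 2048
WARMUP_RATIO = 0.0125
GOAL_STEPS = 10_000_000
TOTAL_STEPS = 8_300_000
TRAINING_DAYS = 58.2

# Warmup steps follow from the ratio applied to the goal (not the actual) steps.
warmup_steps = round(WARMUP_RATIO * GOAL_STEPS)

# Share of the planned schedule that was actually completed.
fraction_completed = TOTAL_STEPS / GOAL_STEPS

# Assumption: one batch element is one training sequence.
sequences_seen = TOTAL_STEPS * BATCH_SIZE

# Average throughput over the whole run, in optimizer steps per second.
steps_per_second = TOTAL_STEPS / (TRAINING_DAYS * 24 * 60 * 60)

print(f"warmup steps:        {warmup_steps:,}")
print(f"schedule completed:  {fraction_completed:.0%}")
print(f"sequences processed: {sequences_seen:,}")
print(f"steps per second:    {steps_per_second:.2f}")
```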

Training loss

https://drive.google.com/uc?export=view&id=1KQc8yWZLKvDLjBtu4IOAgpTx0iLcvX_Q

Dataset used to train dccuchile/albert-tiny-spanish