bert-mini-amharic-16k
This model has the same architecture as bert-mini and was pretrained from scratch using the Amharic subsets of the oscar and mc4 datasets, on a total of 165 million tokens.
It achieves the following results on the evaluation set:
- Loss: 2.59
- Perplexity: 13.33
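As a quick sanity check on these two numbers, perplexity is simply the exponential of the evaluation cross-entropy loss. The snippet below is a minimal illustration of that relationship, not part of the original evaluation code:

```python
import math

# Reported evaluation loss from the table above
eval_loss = 2.59

# Perplexity is exp(cross-entropy loss); exp(2.59) ≈ 13.33
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 13.33
```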
Even though this model has only 7.5 million parameters, its perplexity is comparable to that of the 36x larger xlm-roberta-base model (279 million parameters) on the same Amharic evaluation set.
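Since this is a BERT-style masked language model, it can be tried out with the standard transformers fill-mask pipeline. The sketch below assumes the checkpoint is published on the Hugging Face Hub; the repository id is a placeholder, so substitute the actual model id:

```python
from transformers import pipeline

# Placeholder repo id; replace with the real Hub id of bert-mini-amharic-16k
fill_mask = pipeline("fill-mask", model="your-username/bert-mini-amharic-16k")

# Amharic prompt: "Addis Ababa is the [MASK] of Ethiopia."
results = fill_mask("አዲስ አበባ የኢትዮጵያ [MASK] ናት።")
for r in results:
    print(r["token_str"], round(r["score"], 3))
```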