
language: am


bert-base-multilingual-cased-finetuned-amharic

Model description

bert-base-multilingual-cased-finetuned-amharic is an Amharic BERT model. Because mBERT does not support Amharic, the model was obtained by replacing the mBERT vocabulary with an Amharic vocabulary and then fine-tuning bert-base-multilingual-cased on Amharic text. It performs better than multilingual BERT (mBERT) on Amharic named entity recognition datasets.

Specifically, this model is a bert-base-multilingual-cased model that was fine-tuned on an Amharic corpus using an Amharic vocabulary.

Intended uses & limitations

How to use

You can use this model with the Transformers pipeline for masked token prediction.

>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='Davlan/bert-base-multilingual-cased-finetuned-amharic')
>>> unmasker("የአሜሪካ የአፍሪካ ቀንድ ልዩ መልዕክተኛ ጄፈሪ ፌልትማን በአራት አገራት የሚያደጉትን [MASK] መጀመራቸውን የአሜሪካ የውጪ ጉዳይ ሚንስቴር አስታወቀ።")
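For an input with a single [MASK], the fill-mask pipeline returns a list of dictionaries with the keys `score`, `token`, `token_str`, and `sequence`. The sketch below shows one way to turn that list into ranked (token, score) pairs. The helper names `load_unmasker` and `summarize`, the `min_score` threshold, and the default `top_k=5` are illustrative assumptions, not part of the model or the library.

```python
from typing import Dict, List, Tuple

MODEL_ID = "Davlan/bert-base-multilingual-cased-finetuned-amharic"


def load_unmasker(top_k: int = 5):
    """Build the fill-mask pipeline (hypothetical helper; downloads the model on first call)."""
    # Imported lazily so that summarize() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("fill-mask", model=MODEL_ID, top_k=top_k)


def summarize(predictions: List[Dict], min_score: float = 0.0) -> List[Tuple[str, float]]:
    """Return (token_str, score) pairs sorted by descending score.

    `predictions` is the list of dicts returned by the pipeline for a
    single-[MASK] input; entries scoring below `min_score` are dropped.
    """
    kept = [
        (p["token_str"], float(p["score"]))
        for p in predictions
        if p["score"] >= min_score
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Example usage (requires network access to fetch the model):
#   unmasker = load_unmasker()
#   for token, score in summarize(unmasker("... [MASK] ...")):
#       print(token, round(score, 4))
```

Keeping the model loading separate from the post-processing means the ranking logic can be checked without downloading the weights.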
                    

Limitations and bias

This model is limited by its training dataset of entity-annotated news articles from a specific span of time. It may not generalize well to all use cases in different domains.

Training data

This model was fine-tuned on the Amharic portion of CC-100.

Training procedure

This model was trained on a single NVIDIA V100 GPU.

Eval results on Test set (F-score, average over 5 runs)

| Dataset | mBERT F1 | am_bert F1 |
|---|---|---|
| MasakhaNER | 0.0 | 60.89 |

BibTeX entry and citation info

By David Adelani

