Named Entity Recognition (NER) model for Portuguese

This is a NER model for Portuguese that uses the standard 'enamex' classes: LOC (geographical locations), PER (people), ORG (organizations), and MISC (other entities).

The model is based on BERTimbau Large, fine-tuned on a combination of available corpora (see [1] for details).

There is an alternative model trained using BERTimbau Base: bert-base-pt-ner-enamex.
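As a rough usage sketch, the model can presumably be loaded through the Hugging Face `transformers` token-classification pipeline. This assumes the checkpoint is published on the Hub in the standard format. The helper names `load_ner` and `format_entities` are hypothetical, and the model identifier is deliberately left as a parameter because the full Hub id is not stated here.

```python
def load_ner(model_id):
    """Build a token-classification pipeline for the given Hub model id.

    Assumption: `model_id` is the full Hub identifier of this model
    (or of the Base variant); it is not specified in this card.
    """
    # Imported lazily so the formatting helper below works without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def format_entities(entities):
    """Reduce pipeline output dicts to (text, label) pairs."""
    return [(ent["word"], ent["entity_group"]) for ent in entities]
```

With a valid identifier, a call such as `format_entities(load_ner(model_id)("Fernando Pessoa nasceu em Lisboa."))` should return pairs labelled with the enamex classes above (LOC, PER, ORG, MISC), assuming the checkpoint's label map uses those names.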

It was trained for 3 epochs with a batch size of 32 and a learning rate of 3e-5. It achieved the following results on the test set (Precision/Recall/F1): 0.919/0.925/0.922.
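The reported F1 is consistent with the harmonic mean of the reported precision and recall. The short check below assumes the standard F1 definition; the card does not say whether the scores are micro- or macro-averaged, and the function name `f1_score` is only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the test set in this card.
reported_precision = 0.919
reported_recall = 0.925

print(round(f1_score(reported_precision, reported_recall), 3))  # 0.922
```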

[1] Pablo Gamallo, Marcos Garcia & Patricia Martín-Rodilla, 2019. NER and open information extraction for Portuguese notebook for IberLEF 2019 Portuguese named entity recognition and relation extraction tasks. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019) co-located with 35th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019): 457-467.
