YAML Metadata
Warning:
empty or missing yaml metadata in repo card
(https://huggingface.co/docs/hub/model-cards#model-card-metadata)
Classifying Text into NACE Codes
This model is xlm-roberta-base fine-tuned to classify descriptions of activities into NACE Rev. 2 codes.
Data
The data used to fine-tune the model consist of 2.5 million descriptions of activities from Norwegian and Danish businesses. To improve the model's multilingual performance, random samples of the Norwegian and Danish descriptions were machine translated into the following languages:
- English
- German
- Spanish
- French
- Finnish
- Polish
Quick Start
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("erst/xlm-roberta-base-finetuned-nace")
model = AutoModelForSequenceClassification.from_pretrained("erst/xlm-roberta-base-finetuned-nace")
pl = pipeline(
"sentiment-analysis",
model=model,
tokenizer=tokenizer,
return_all_scores=False,
)
pl("The purpose of our company is to build houses")
License
This model is released under the MIT License
- Downloads last month
- 94
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.