SpanMarker with numind/generic-entity_recognition_NER-multilingual-v1 on wikiann

This is a SpanMarker model for Named Entity Recognition, trained on the German (de) portion of the wikiann dataset. It uses numind/generic-entity_recognition_NER-multilingual-v1 as the underlying encoder.

Model Details

Model Description

Model Type: SpanMarker
Encoder: numind/generic-entity_recognition_NER-multilingual-v1
Maximum Sequence Length: 256 tokens
Maximum Entity Length: 9 words
Training Dataset: wikiann
Language: de
License: mit
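
The Maximum Entity Length of 9 words means that a longer entity cannot be returned as a single span. As a rough illustration of what that cap implies (a sketch of the general idea behind span-based tagging, not SpanMarker's actual implementation), the snippet below enumerates the word spans a sentence offers under the limit. The function name `candidate_spans` is hypothetical.

```python
def candidate_spans(words, max_entity_length=9):
    """Return (start, end) word-index pairs for every span of at most
    max_entity_length words; end is exclusive."""
    spans = []
    for start in range(len(words)):
        # A span cannot run past the end of the sentence or exceed the cap.
        longest = min(max_entity_length, len(words) - start)
        for length in range(1, longest + 1):
            spans.append((start, start + length))
    return spans


# Whitespace splitting is an assumption made for this illustration only.
words = "Sein Bundesliga-Debüt gab der Angreifer am 23.".split()
spans = candidate_spans(words)
print(len(words), "words ->", len(spans), "candidate spans")
```

For sentences shorter than the cap every contiguous span is a candidate; for longer sentences the cap keeps the number of candidates roughly linear in sentence length.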

Model Sources

Repository: SpanMarker on GitHub (https://github.com/tomaarsen/SpanMarkerNER)
Thesis: SpanMarker For Named Entity Recognition

Model Labels

Label	Examples
LOC	"Savoyer Voralpen", "Bagan", "Zechin"
ORG	"NHL Entry Draft", "SKA Sankt Petersburg", "Minnesota Wild"
PER	"Antonina Wladimirowna Kriwoschapka", "Lou Salomé", "Jaan Kirsipuu"

Evaluation

Metrics

Label	Precision	Recall	F1
all	0.9070	0.9070	0.9070
LOC	0.9036	0.9298	0.9165
ORG	0.8638	0.8446	0.8541
PER	0.9507	0.9405	0.9455
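
As a sanity check on the table, the F1 column can be recomputed from precision and recall, assuming the usual definition of F1 as their harmonic mean (the card does not state the formula explicitly). Because the table values are rounded to four decimals, the recomputed figure can differ from the reported one in the last digit.

```python
# (precision, recall, reported F1) per label, copied from the Metrics table.
metrics = {
    "all": (0.9070, 0.9070, 0.9070),
    "LOC": (0.9036, 0.9298, 0.9165),
    "ORG": (0.8638, 0.8446, 0.8541),
    "PER": (0.9507, 0.9405, 0.9455),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for label, (precision, recall, reported) in metrics.items():
    recomputed = f1_score(precision, recall)
    print(f"{label}: recomputed F1 = {recomputed:.4f}, reported F1 = {reported:.4f}")
```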

Uses

Direct Use for Inference

from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("davanstrien/numind_generic-entity_recognition_NER-multilingual-v1_wikiann_de")
# Run inference
entities = model.predict("Sein Bundesliga-Debüt gab der Angreifer am 23.")
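
A possible way to post-process the result, assuming `predict` returns a list of dictionaries with `span`, `label`, and `score` keys for a single input string (the format described in the SpanMarker documentation; verify against your installed version). The helper `group_by_label` and the sample list are hypothetical: the sample is a hand-written placeholder in that format, not actual output of this model.

```python
from collections import defaultdict


def group_by_label(predictions, min_score=0.0):
    """Collect predicted spans per label, dropping low-confidence entities."""
    grouped = defaultdict(list)
    for entity in predictions:
        if entity["score"] >= min_score:
            grouped[entity["label"]].append(entity["span"])
    return dict(grouped)


# Hand-written placeholder in the assumed output format; NOT real model output.
# The scores are arbitrary values chosen to demonstrate the threshold.
example_predictions = [
    {"span": "Minnesota Wild", "label": "ORG", "score": 0.9},
    {"span": "Jaan Kirsipuu", "label": "PER", "score": 0.4},
]

print(group_by_label(example_predictions, min_score=0.5))
```

With real predictions, pass the `entities` list from the snippet above in place of `example_predictions`.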

Downstream Use

You can finetune this model on your own dataset.

from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("davanstrien/numind_generic-entity_recognition_NER-multilingual-v1_wikiann_de")

# Specify a Dataset with "tokens" and "ner_tags" columns
dataset = load_dataset("conll2003") # For example CoNLL2003

# Initialize a Trainer using the pretrained model & dataset
trainer = Trainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
trainer.save_model("span_marker_model_id-finetuned")

Training Details

Training Set Metrics

Training set	Min	Median	Max
Sentence length	1	9.7693	85
Entities per sentence	1	1.3821	20

Training Hyperparameters

learning_rate: 5e-05
train_batch_size: 64
eval_batch_size: 128
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10
mixed_precision_training: Native AMP
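
The relationship between these values can be sketched as follows. Two assumptions are made here: training ran on a single device (which is consistent with 64 × 2 = 128), and the linear scheduler ramps up during the warmup fraction and then decays linearly to zero, as the linear schedule in Transformers does. The function `lr_at` is a hypothetical, continuous approximation expressed in terms of training progress; the real scheduler works per optimizer step.

```python
learning_rate = 5e-05
train_batch_size = 64
gradient_accumulation_steps = 2
warmup_ratio = 0.1

# Effective batch size per optimizer step (single-device assumption).
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def lr_at(progress):
    """Approximate learning rate at a fraction of training completed (0.0 to 1.0)."""
    if progress < warmup_ratio:
        # Linear warmup from 0 to the peak learning rate.
        return learning_rate * progress / warmup_ratio
    # Linear decay from the peak to 0 at the end of training.
    return learning_rate * max(0.0, 1.0 - progress) / (1.0 - warmup_ratio)


print("total_train_batch_size:", total_train_batch_size)
for progress in (0.0, 0.05, 0.1, 0.55, 1.0):
    print(f"progress {progress:.2f}: lr ~ {lr_at(progress):.2e}")
```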

Training Results

Epoch	Step	Validation Loss	Validation Precision	Validation Recall	Validation F1	Validation Accuracy
1.2658	200	0.0172	0.8842	0.8534	0.8686	0.9586
2.5316	400	0.0145	0.8977	0.8889	0.8933	0.9670
3.7975	600	0.0161	0.8962	0.9006	0.8984	0.9688
5.0633	800	0.0180	0.8982	0.8996	0.8989	0.9689
6.3291	1000	0.0201	0.9014	0.9008	0.9011	0.9694
7.5949	1200	0.0201	0.9010	0.9057	0.9033	0.9702
8.8608	1400	0.0217	0.9062	0.9036	0.9049	0.9702
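
A small sketch for reading this log programmatically, using only the rows above. It shows that validation loss is lowest at step 400 while validation F1 keeps improving until the last logged step, so which checkpoint counts as "best" depends on the metric chosen. The steps-per-epoch figure is an estimate derived from the Epoch and Step columns, not a value reported by the card.

```python
# (epoch, step, val_loss, precision, recall, f1, accuracy), copied from the table.
log = [
    (1.2658, 200, 0.0172, 0.8842, 0.8534, 0.8686, 0.9586),
    (2.5316, 400, 0.0145, 0.8977, 0.8889, 0.8933, 0.9670),
    (3.7975, 600, 0.0161, 0.8962, 0.9006, 0.8984, 0.9688),
    (5.0633, 800, 0.0180, 0.8982, 0.8996, 0.8989, 0.9689),
    (6.3291, 1000, 0.0201, 0.9014, 0.9008, 0.9011, 0.9694),
    (7.5949, 1200, 0.0201, 0.9010, 0.9057, 0.9033, 0.9702),
    (8.8608, 1400, 0.0217, 0.9062, 0.9036, 0.9049, 0.9702),
]

best_f1 = max(log, key=lambda row: row[5])
lowest_loss = min(log, key=lambda row: row[2])

# Estimate of optimizer steps per epoch from the first logged row.
steps_per_epoch = round(log[0][1] / log[0][0])

print("Best validation F1:", best_f1[5], "at step", best_f1[1])
print("Lowest validation loss:", lowest_loss[2], "at step", lowest_loss[1])
print("Estimated steps per epoch:", steps_per_epoch)
```

If the estimate holds, 10 epochs correspond to roughly 1,580 steps, which would explain why the last evaluation in the table is at step 1400 rather than at the end of training.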

Framework Versions

Python: 3.10.12
SpanMarker: 1.5.0
Transformers: 4.35.2
PyTorch: 2.1.0+cu118
Datasets: 2.15.0
Tokenizers: 0.15.0

Citation

BibTeX

@software{Aarsen_SpanMarker,
    author = {Aarsen, Tom},
    license = {Apache-2.0},
    title = {{SpanMarker for Named Entity Recognition}},
    url = {https://github.com/tomaarsen/SpanMarkerNER}
}

Model Files

Model ID: davanstrien/numind_generic-entity_recognition_NER-multilingual-v1_wikiann_de
Base model: numind/NuNER-multilingual-v0.1
Weights format: Safetensors
Model size: 0.2B params
Tensor type: F32
