Flair NER model trained on GermEval14 dataset

This model was trained on the official GermEval14 dataset using the Flair framework.

It uses a fine-tuned German DistilBERT model from here.

Results

Dataset \ Run Run 1 Run 2 Run 3† Run 4 Run 5 Avg.
Development 87.05 86.52 87.34 86.85 86.46 86.84
Test 85.43 85.88 85.72 85.47 85.62 85.62

† denotes that this model is selected for upload.

Flair Fine-Tuning

We used the following script to fine-tune the model on the GermEval14 dataset:

from argparse import ArgumentParser
import torch, flair              

# dataset, model and embedding imports
from flair.datasets import GERMEVAL_14
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger    
from flair.trainers import ModelTrainer

if __name__ == "__main__":

    # All arguments that can be passed
    parser = ArgumentParser()
    parser.add_argument("-s", "--seeds", nargs='+', type=int, default='42')  # pass list of seeds for experiments
    parser.add_argument("-c", "--cuda", type=int, default=0, help="CUDA device")  # which cuda device to use
    parser.add_argument("-m", "--model", type=str, help="Model name (such as Hugging Face model hub name")

    # Parse experimental arguments
    args = parser.parse_args()

    # use cuda device as passed
    flair.device = f'cuda:{str(args.cuda)}'

    # for each passed seed, do one experimental run
    for seed in args.seeds:
        flair.set_seed(seed)

        # model
        hf_model = args.model

        # initialize embeddings
        embeddings = TransformerWordEmbeddings(
            model=hf_model,
            layers="-1",
            subtoken_pooling="first",
            fine_tune=True,
            use_context=False,
            respect_document_boundaries=False,
        )

        # select dataset depending on which language variable is passed
        corpus = GERMEVAL_14()

        # make the dictionary of tags to predict
        tag_dictionary = corpus.make_tag_dictionary('ner')

        # init bare-bones sequence tagger (no reprojection, LSTM or CRF)
        tagger: SequenceTagger = SequenceTagger(
            hidden_size=256,
            embeddings=embeddings,
            tag_dictionary=tag_dictionary,
            tag_type='ner',
            use_crf=False,
            use_rnn=False,
            reproject_embeddings=False,
        )

        # init the model trainer
        trainer = ModelTrainer(tagger, corpus, optimizer=torch.optim.AdamW)

        # make string for output folder
        output_folder = f"flert-ner-{hf_model}-{seed}"

        # train with XLM parameters (AdamW, 20 epochs, small LR)
        from torch.optim.lr_scheduler import OneCycleLR

        trainer.train(
            output_folder,
            learning_rate=5.0e-5,
            mini_batch_size=16,
            mini_batch_chunk_size=1,
            max_epochs=10,
            scheduler=OneCycleLR,
            embeddings_storage_mode='none',
            weight_decay=0.,
            train_with_dev=False,
        )
Downloads last month
6
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Dataset used to train stefan-it/flair-distilbert-ner-germeval14