---
language: de
license: cc-by-4.0
tags:
- named-entity-recognition
- legal
- ner
datasets:
- elenanereiss/german-ler
metrics:
- precision
- recall
- f1
model-index:
- name: elenanereiss/bert-german-ler
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: elenanereiss/german-ler
      type: elenanereiss/german-ler
      args: elenanereiss/german-ler
    metrics:
    - name: F1
      type: f1
      value: 0.9546215361725869
    - name: Precision
      type: precision
      value: 0.9449558173784978
    - name: Recall
      type: recall
      value: 0.9644870349492672
pipeline_tag: token-classification
widget:
- text: "Herr W. verstieß gegen § 36 Abs. 7 IfSG."
---

# bert-german-ler

## Model description

This model is a fine-tuned version of [bert-base-german-cased](https://huggingface.co/bert-base-german-cased) for named entity recognition in German legal documents. It was trained on the [German LER Dataset](https://huggingface.co/datasets/elenanereiss/german-ler).

## Intended uses & limitations

to do

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 12
- eval_batch_size: 16
- max_seq_length: 512
- num_epochs: 3
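
As a rough illustration, the values above could be handed to the Hugging Face `Trainer` along the following lines. This is a sketch, not the original training script: treating `train_batch_size` and `eval_batch_size` as per-device batch sizes is an assumption, and `output_dir` and all helper names are hypothetical. `max_seq_length` is not a `TrainingArguments` field, so it is shown as a tokenizer setting instead.

```python
# Hyperparameters exactly as listed in this model card.
CARD_HYPERPARAMETERS = {
    "learning_rate": 1e-05,
    "train_batch_size": 12,
    "eval_batch_size": 16,
    "max_seq_length": 512,
    "num_epochs": 3,
}


def training_arguments_kwargs(card=CARD_HYPERPARAMETERS):
    """Translate the card's names into transformers.TrainingArguments keywords.

    Assumption: the batch sizes on the card are per-device values.
    """
    return {
        "learning_rate": card["learning_rate"],
        "per_device_train_batch_size": card["train_batch_size"],
        "per_device_eval_batch_size": card["eval_batch_size"],
        "num_train_epochs": card["num_epochs"],
    }


def tokenizer_kwargs(card=CARD_HYPERPARAMETERS):
    """max_seq_length is applied when tokenizing, not by TrainingArguments."""
    return {"truncation": True, "max_length": card["max_seq_length"]}


def build_training_arguments(output_dir="bert-german-ler-run"):
    """Create TrainingArguments; output_dir is a hypothetical local path."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(output_dir=output_dir, **training_arguments_kwargs())
```

The two dictionaries keep the optimizer settings separate from the tokenization settings, which mirrors where each value is consumed in a typical fine-tuning run.
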

## Results

```
eval_loss = 0.020239440724253654
eval_accuracy_score = 0.9953227664227791
eval_precision = 0.9212203128016991
eval_recall = 0.9458762886597938
eval_f1 = 0.9333855032769246
eval_runtime = 111.4147
eval_samples_per_second = 59.875
eval_steps_per_second = 3.743
epoch = 3.0
```

```
test_loss = 0.011871221475303173
test_accuracy_score = 0.9969460436964865
test_precision = 0.9449558173784978
test_recall = 0.9644870349492672
test_f1 = 0.9546215361725869
test_runtime = 111.5143
test_samples_per_second = 59.849
test_steps_per_second = 3.748
```
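
F1 is the harmonic mean of precision and recall, so the reported figures can be cross-checked with a few lines of Python. The sketch below only re-derives F1 from the precision and recall values printed above; the function and variable names are illustrative and not part of any training script.

```python
def f1_from_precision_recall(precision, recall):
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the two result blocks in this card.
REPORTED = {
    "eval": {
        "precision": 0.9212203128016991,
        "recall": 0.9458762886597938,
        "f1": 0.9333855032769246,
    },
    "test": {
        "precision": 0.9449558173784978,
        "recall": 0.9644870349492672,
        "f1": 0.9546215361725869,
    },
}

for split, metrics in REPORTED.items():
    recomputed = f1_from_precision_recall(metrics["precision"], metrics["recall"])
    difference = abs(recomputed - metrics["f1"])
    print(f"{split}: reported={metrics['f1']:.6f} recomputed={recomputed:.6f} diff={difference:.2e}")
```
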

### Usage

to do
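
A minimal sketch of running the model through the `transformers` token-classification pipeline, assuming `transformers` and a backend such as PyTorch are installed and the Hub is reachable. The model ID and the example sentence come from this card's metadata; the helper names `load_tagger` and `format_entities` are hypothetical, and choosing `aggregation_strategy="simple"` is an assumption about how the output should be grouped.

```python
MODEL_ID = "elenanereiss/bert-german-ler"


def load_tagger(model_id=MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def format_entities(entities):
    """Turn aggregated pipeline output into (text, label, score) tuples."""
    return [
        (entity["word"], entity["entity_group"], round(float(entity["score"]), 3))
        for entity in entities
    ]


# Example call (requires network access to fetch the model):
# tagger = load_tagger()
# print(format_entities(tagger("Herr W. verstieß gegen § 36 Abs. 7 IfSG.")))
```

With an aggregation strategy set, each returned entity is a dictionary with `entity_group`, `score`, `word`, `start`, and `end` keys, which is what `format_entities` relies on.
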

### Reference

```
@misc{https://doi.org/10.48550/arxiv.2003.13016,
  doi = {10.48550/ARXIV.2003.13016},
  url = {https://arxiv.org/abs/2003.13016},
  author = {Leitner, Elena and Rehm, Georg and Moreno-Schneider, Julián},
  keywords = {Computation and Language (cs.CL), Information Retrieval (cs.IR), FOS: Computer and information sciences},
  title = {A Dataset of German Legal Documents for Named Entity Recognition},
  publisher = {arXiv},
  year = {2020},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```