---
tags:
- generated_from_trainer
datasets:
- custom_legal_danish_corpus
model-index:
- name: danish-lex-lm-base-mlm
  results: []
---

# danish-lex-lm-base-mlm

This model is a fine-tuned version of data/PLMs/danish-lm/danish-lex-lm-base on the custom_legal_danish_corpus dataset. It achieves the following results on the evaluation set:

- Loss: 0.7302
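
The model can be queried directly through the `fill-mask` pipeline. A minimal sketch, assuming the checkpoint is published on the Hub as `kiddothe2b/danish-lex-lm-base-mlm` (inferred from the model name, not confirmed by this card) and uses a RoBERTa-style `<mask>` token:

```python
from transformers import pipeline

# The Hub id below is an assumption based on the model name; point it at
# the actual checkpoint location if it differs.
fill_mask = pipeline("fill-mask", model="kiddothe2b/danish-lex-lm-base-mlm")

# Danish: "The tenant must pay <mask> every month."
for pred in fill_mask("Lejeren skal betale <mask> hver måned."):
    print(f"{pred['token_str']!r}  score={pred['score']:.3f}")
```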

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: tpu
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 256
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- training_steps: 500000
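
A minimal reconstruction of this configuration with the Transformers `Trainer` API, as a sketch; `output_dir` is a placeholder, and the TPU launch wiring is assumed to be handled externally (e.g. by an XLA spawn script):

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters reported above; output_dir is
# a placeholder. With 8 TPU cores, per-device batch size 16, and
# gradient_accumulation_steps=2, the effective train batch size is
# 16 * 8 * 2 = 256 (eval: 16 * 8 = 128), matching the totals above.
training_args = TrainingArguments(
    output_dir="danish-lex-lm-base-mlm",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    max_steps=500_000,
    tpu_num_cores=8,
)
```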

### Training results

| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 1.4648        | 5.36  | 50000  | 1.2920          |
| 1.2165        | 10.72 | 100000 | 1.0625          |
| 1.0952        | 16.07 | 150000 | 0.9611          |
| 1.0233        | 21.43 | 200000 | 0.8931          |
| 0.963         | 26.79 | 250000 | 0.8477          |
| 0.9122        | 32.15 | 300000 | 0.8168          |
| 0.8697        | 37.51 | 350000 | 0.7836          |
| 0.8397        | 42.86 | 400000 | 0.7560          |
| 0.8231        | 48.22 | 450000 | 0.7476          |
| 0.8207        | 53.58 | 500000 | 0.7243          |
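
Assuming the validation loss is the standard mean cross-entropy over masked tokens, the final value corresponds to a pseudo-perplexity of exp(0.7243) ≈ 2.06 (and exp(0.7302) ≈ 2.08 for the evaluation-set loss reported above).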

### Framework versions

- Transformers 4.18.0
- PyTorch 1.12.0+cu102
- Datasets 2.0.0
- Tokenizers 0.12.0