metadata

datasets:
  - eduagarcia/LegalPT
  - eduagarcia/cc100-pt
  - eduagarcia/OSCAR-2301-pt_dedup
  - eduagarcia/brwac_dedup
language:
  - pt
pipeline_tag: fill-mask
tags:
  - legal
model-index:
  - name: RoBERTaLexPT-base
    results:
      - task:
          type: token-classification
        dataset:
          type: eduagarcia/portuguese_benchmark
          name: LeNER
          config: LeNER-Br
          split: test
        metrics:
          - type: seqeval
            value: 90.73
            name: Mean F1
            args:
              scheme: IOB2
      - task:
          type: token-classification
        dataset:
          type: eduagarcia/portuguese_benchmark
          name: UlyNER-PL Coarse
          config: UlyssesNER-Br-PL-coarse
          split: test
        metrics:
          - type: seqeval
            value: 88.56
            name: Mean F1
            args:
              scheme: IOB2
      - task:
          type: token-classification
        dataset:
          type: eduagarcia/portuguese_benchmark
          name: UlyNER-PL Fine
          config: UlyssesNER-Br-PL-fine
          split: test
        metrics:
          - type: seqeval
            value: 86.03
            name: Mean F1
            args:
              scheme: IOB2
license: cc-by-4.0
metrics:
  - seqeval

RoBERTaLexPT-base

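RoBERTaLexPT-base is a masked language model for Portuguese legal text, pretrained on the datasets listed in the metadata (LegalPT, cc100-pt, OSCAR-2301-pt_dedup, and brwac_dedup), using the RoBERTa-base architecture introduced by Liu et al. (2019).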
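Since the metadata tags the model as `fill-mask`, a minimal usage sketch may help. It assumes the Hub id is `eduagarcia/RoBERTaLexPT-base`, that the `transformers` library is installed, and that the weights can be downloaded; the helper name `top_predictions` and the `[MASK]` placeholder are illustrative, not part of the model.

```python
# Hedged sketch: query the model through the generic fill-mask pipeline.
MODEL_ID = "eduagarcia/RoBERTaLexPT-base"  # assumed Hub id


def top_predictions(text, k=5):
    """Return the k most likely (token, score) fills for "[MASK]" in text."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)
    # Swap the illustrative placeholder for the tokenizer's real mask token.
    masked = text.replace("[MASK]", fill.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill(masked, top_k=k)]


# Example call (downloads the model on first use):
# top_predictions("O réu foi condenado pelo [MASK].")
```

The helper rebuilds the pipeline on every call, which keeps the sketch short; for repeated queries it would be cheaper to build the pipeline once and reuse it.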

Model Details

Model Description

Funded by: [More Information Needed]
Language(s) (NLP): Brazilian Portuguese (pt-BR)
License: Creative Commons Attribution 4.0 International Public License

Model Sources

Repository: https://github.com/eduagarcia/roberta-legal-portuguese
Paper: [More Information Needed]

Training Details

Training Data

[More Information Needed]

Training Procedure

The model was pretrained for 62,500 steps with a batch size of 2,048 sequences, each containing at most 512 tokens. This compute budget is similar to that of BERTimbau and exposes the model to approximately 65 billion tokens during training.
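The token figure can be checked with a few lines of arithmetic. Because 512 is a maximum sequence length, the product is an upper bound, which the text rounds to about 65 billion; the variable names are illustrative.

```python
# Upper bound on tokens seen during pretraining, from the figures above.
steps = 62_500        # training steps
batch_size = 2_048    # sequences per step
max_seq_len = 512     # maximum tokens per sequence

tokens_upper_bound = steps * batch_size * max_seq_len
print(f"{tokens_upper_bound:,}")  # 65,536,000,000
```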

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

| Hyperparameter | RoBERTa-base |
|---|---|
| Number of layers | 12 |
| Hidden size | 768 |
| FFN inner hidden size | 3072 |
| Attention heads | 12 |
| Attention head size | 64 |
| Dropout | 0.1 |
| Attention dropout | 0.1 |
| Warmup steps | 6k |
| Peak learning rate | 4e-4 |
| Batch size | 2048 |
| Weight decay | 0.01 |
| Maximum training steps | 62.5k |
| Learning rate decay | Linear |
| AdamW $$\epsilon$$ | 1e-6 |
| AdamW $$\beta_1$$ | 0.9 |
| AdamW $$\beta_2$$ | 0.98 |
| Gradient clipping | 0.0 |
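The warmup, peak learning rate, and linear decay entries together describe a learning-rate schedule. The sketch below is one plausible reading, not the authors' training code: it assumes the warmup rises linearly from zero and that the decay reaches zero at the final step, neither of which the table states. The function name `learning_rate` is illustrative.

```python
# Hedged sketch of a linear warmup / linear decay schedule.
PEAK_LR = 4e-4        # peak learning rate (from the table)
WARMUP_STEPS = 6_000  # warmup steps (from the table)
MAX_STEPS = 62_500    # maximum training steps (from the table)


def learning_rate(step):
    """Learning rate at a given step under the assumptions stated above."""
    if step < WARMUP_STEPS:
        # Assumption: linear ramp from 0 up to the peak.
        return PEAK_LR * step / WARMUP_STEPS
    # Assumption: linear decay from the peak down to 0 at MAX_STEPS.
    remaining = max(0, MAX_STEPS - step)
    return PEAK_LR * remaining / (MAX_STEPS - WARMUP_STEPS)


for s in (0, 3_000, 6_000, 34_250, 62_500):
    print(s, learning_rate(s))
```

Under these assumptions the rate is half the peak at step 3,000 (mid-warmup) and again at step 34,250 (mid-decay).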
