---
datasets:
- eduagarcia/LegalPT
- eduagarcia/cc100-pt
- eduagarcia/OSCAR-2301-pt_dedup
- eduagarcia/brwac_dedup
language:
- pt
pipeline_tag: fill-mask
tags:
- legal
model-index:
- name: RoBERTaLexPT-base
  results:
  - task:
      type: token-classification
    dataset:
      type: eduagarcia/portuguese_benchmark
      name: LeNER
      config: LeNER-Br
      split: test
    metrics:
    - type: seqeval
      value: 90.73
      name: Mean F1
      args:
        scheme: IOB2
  - task:
      type: token-classification
    dataset:
      type: eduagarcia/portuguese_benchmark
      name: UlyNER-PL Coarse
      config: UlyssesNER-Br-PL-coarse
      split: test
    metrics:
    - type: seqeval
      value: 88.56
      name: Mean F1
      args:
        scheme: IOB2
  - task:
      type: token-classification
    dataset:
      type: eduagarcia/portuguese_benchmark
      name: UlyNER-PL Fine
      config: UlyssesNER-Br-PL-fine
      split: test
    metrics:
    - type: seqeval
      value: 86.03
      name: Mean F1
      args:
        scheme: IOB2
license: cc-by-4.0
metrics:
- seqeval
---
# RoBERTaLexPT-base
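RoBERTaLexPT-base is a Portuguese masked language model pretrained on the datasets listed in this card's metadata (LegalPT, cc100-pt, OSCAR-2301-pt_dedup, and brwac_dedup), using the [RoBERTa-base](https://huggingface.co/FacebookAI/roberta-base) architecture introduced by [Liu et al. (2019)](https://arxiv.org/abs/1907.11692).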
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Funded by:** [More Information Needed]
- **Language(s) (NLP):** Brazilian Portuguese (pt-BR)
- **License:** [Creative Commons Attribution 4.0 International Public License](https://creativecommons.org/licenses/by/4.0/deed.en)
### Model Sources
- **Repository:** https://github.com/eduagarcia/roberta-legal-portuguese
- **Paper:** [More Information Needed]
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
[More Information Needed]
### Training Procedure
The model was pretrained for 62,500 steps with a batch size of 2048 sequences, each containing at most 512 tokens.
This compute budget is similar to that of [BERTimbau](https://dl.acm.org/doi/abs/10.1007/978-3-030-61377-8_28) and exposes the model to approximately 65 billion tokens during training.
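The token budget quoted above can be checked with a short calculation. This is a minimal sketch that assumes every sequence is filled to the 512-token maximum, so the result is an upper bound; the variable names are illustrative and not taken from the training code.

```python
# Upper bound on tokens seen during pretraining, assuming every
# sequence is packed to the maximum length (an assumption, not a
# statement about the actual data pipeline).
training_steps = 62_500      # from the training procedure above
batch_size = 2048            # sequences per step
max_sequence_length = 512    # tokens per sequence (maximum)

total_tokens = training_steps * batch_size * max_sequence_length

print(f"{total_tokens:,} tokens (~{total_tokens / 1e9:.1f} billion)")
```

Under that assumption the product is about 65.5 billion tokens, consistent with the "approximately 65 billion" figure.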
#### Preprocessing [optional]
[More Information Needed]
#### Training Hyperparameters
| **Hyperparameter** | **RoBERTa-base** |
|------------------------|-----------------:|
| Number of layers | 12 |
| Hidden size | 768 |
| FFN inner hidden size | 3072 |
| Attention heads | 12 |
| Attention head size | 64 |
| Dropout | 0.1 |
| Attention dropout | 0.1 |
| Warmup steps | 6k |
| Peak learning rate | 4e-4 |
| Batch size | 2048 |
| Weight decay | 0.01 |
| Maximum training steps | 62.5k |
| Learning rate decay | Linear |
| AdamW $$\epsilon$$ | 1e-6 |
| AdamW $$\beta_1$$ | 0.9 |
| AdamW $$\beta_2$$ | 0.98 |
| Gradient clipping | 0.0 |
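
The warmup, peak learning rate, and linear decay entries in the table can be read together as a schedule. The sketch below is one plausible interpretation, assuming the rate rises linearly from 0 to the peak over the warmup steps and then decays linearly to 0 at the maximum training step; the start and end values and the function name `learning_rate` are assumptions, not details confirmed by this card.

```python
# Hypothetical reconstruction of the learning-rate schedule from the
# hyperparameter table. The 0 -> peak -> 0 shape is an assumption.
PEAK_LR = 4e-4
WARMUP_STEPS = 6_000
MAX_STEPS = 62_500


def learning_rate(step: int) -> float:
    """Return the assumed learning rate at a given optimizer step."""
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak down to 0 at MAX_STEPS.
    remaining = max(MAX_STEPS - step, 0)
    return PEAK_LR * remaining / (MAX_STEPS - WARMUP_STEPS)


for step in (0, 3_000, 6_000, 34_250, 62_500):
    print(step, f"{learning_rate(step):.2e}")
```

In this reading the rate reaches its peak at step 6,000 and is back to zero at step 62,500; the actual training code may use a different floor or warmup shape.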
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Dataset Card if possible. -->
[More Information Needed]
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
[More Information Needed]
### Results
[More Information Needed]
#### Summary
## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
[More Information Needed]