---
license: cc-by-nc-nd-4.0
datasets:
- taln-ls2n/Adminset
language:
- fr
library_name: transformers
tags:
- camembert
- BERT
- Administrative documents
---

# AdminBERT 16GB: A French Language Model adapted to administrative documents

[AdminBERT-16GB](example) is a French language model adapted on a large corpus of 50 million French administrative texts. It is derived from the CamemBERT model, which is based on the RoBERTa architecture. AdminBERT-16GB was trained with the Masked Language Modeling (MLM) objective, using a 30% mask rate, for 3 epochs on 24 A100 GPUs. The training dataset is [Adminset](https://huggingface.co/datasets/taln-ls2n/Adminset).
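
To make the 30% mask rate concrete, here is a minimal, library-free sketch of how one MLM training example could be built. It is an illustration only, not the AdminBERT training code. It assumes the conventional BERT/RoBERTa 80/10/10 replacement split, which this card does not confirm, and it selects an exact share of positions rather than sampling each position independently. The function `mask_tokens`, the toy vocabulary, and the example sentence are all hypothetical.

```python
import random

MASK_TOKEN = "<mask>"  # RoBERTa/CamemBERT-style mask token
MASK_RATE = 0.30       # mask rate stated in this card


def mask_tokens(tokens, vocab, mask_rate=MASK_RATE, seed=None):
    """Return (masked_tokens, labels) for one MLM training example.

    labels[i] holds the original token at selected positions and None
    elsewhere, so a loss would be computed only on selected positions.
    """
    if not tokens:
        return [], []

    rng = random.Random(seed)
    # Select a fixed share of positions (at least one) to predict.
    n_select = max(1, round(len(tokens) * mask_rate))
    selected = set(rng.sample(range(len(tokens)), n_select))

    masked, labels = [], []
    for i, tok in enumerate(tokens):
        if i not in selected:
            masked.append(tok)
            labels.append(None)
            continue
        labels.append(tok)
        roll = rng.random()
        if roll < 0.8:
            masked.append(MASK_TOKEN)         # assumed 80%: mask token
        elif roll < 0.9:
            masked.append(rng.choice(vocab))  # assumed 10%: random token
        else:
            masked.append(tok)                # assumed 10%: unchanged
    return masked, labels


# Hypothetical pre-tokenized sentence (10 tokens) and toy vocabulary.
tokens = "le conseil municipal a adopté la délibération à l' unanimité".split()
vocab = sorted(set(tokens))

masked, labels = mask_tokens(tokens, vocab, seed=0)
print(masked)
print(labels)
```

With 10 input tokens and a 30% rate, exactly 3 positions are selected for prediction; the remaining 7 tokens pass through unchanged and carry no label.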