---
license: cc-by-nc-nd-4.0
datasets:
- taln-ls2n/Adminset
language:
- fr
library_name: transformers
tags:
- camembert
- BERT
- Administrative documents
---

# AdminBERT 16GB: A French Language Model adapted to administrative documents

[AdminBERT-16GB](example) is a French language model adapted to a large corpus of 50 million French administrative texts. It is derived from the CamemBERT model, which is based on the RoBERTa architecture. AdminBERT-16GB was trained with the Masked Language Modeling (MLM) objective, using a 30% mask rate, for 3 epochs on 24 A100 GPUs. The training dataset is [Adminset](https://huggingface.co/datasets/taln-ls2n/Adminset).
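
To make the 30% mask rate concrete, here is a minimal, simplified sketch of MLM input masking in plain Python. It is not the training code used for AdminBERT: the `mask_tokens` helper, the whitespace tokenization, the fixed seed, and the example sentence are all assumptions for illustration, and actual MLM pipelines typically work on subword tokens and may also leave some selected tokens unchanged or swap them for random ones.

```python
import random

MASK_TOKEN = "<mask>"  # mask token used by CamemBERT/RoBERTa-style tokenizers
MASK_RATE = 0.30       # mask rate stated in the model description


def mask_tokens(tokens, mask_rate=MASK_RATE, seed=0):
    """Hypothetical helper: mask a fixed share of tokens for an MLM example.

    Returns (masked_tokens, labels). `labels` holds the original token at
    each masked position and None everywhere else, so a model would only be
    scored on the positions it has to reconstruct.
    """
    rng = random.Random(seed)
    # Mask at least one token for any non-empty input.
    n_to_mask = max(1, round(len(tokens) * mask_rate)) if tokens else 0
    masked_positions = set(rng.sample(range(len(tokens)), n_to_mask))
    masked = [MASK_TOKEN if i in masked_positions else tok
              for i, tok in enumerate(tokens)]
    labels = [tok if i in masked_positions else None
              for i, tok in enumerate(tokens)]
    return masked, labels


# Whitespace splitting stands in for the real subword tokenizer (assumption).
tokens = "Le conseil municipal approuve le budget primitif de la commune".split()
masked, labels = mask_tokens(tokens)
print(" ".join(masked))
```

With 10 input tokens, 3 are replaced by `<mask>`; during training the model is asked to predict the original tokens at exactly those positions.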