---
license: cc-by-nc-nd-4.0
datasets:
  - taln-ls2n/Adminset
language:
  - fr
library_name: transformers
tags:
  - camembert
  - BERT
  - Administrative documents
---

AdminBERT 16GB: A French Language Model adapted to administrative documents

AdminBERT-16GB is a French language model adapted to the administrative domain using a large corpus of 50 million French administrative texts. It is derived from the CamemBERT model, which is based on the RoBERTa architecture. AdminBERT-16GB was trained with the Masked Language Modeling (MLM) objective, using a 30% mask rate, for 3 epochs on 24 A100 GPUs. The training dataset is Adminset (taln-ls2n/Adminset).
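
To make the 30% mask rate concrete, here is a minimal, illustrative sketch of MLM input corruption. It is not the training code used for AdminBERT-16GB. The helper `mask_tokens` is hypothetical, and the sketch makes simplifying assumptions: it works on whitespace-separated words rather than subword tokens, masks a fixed fraction of positions per sequence, and always substitutes the `<mask>` token (real MLM pipelines often sample positions independently and sometimes keep or randomize the selected token).

```python
import random

MASK_TOKEN = "<mask>"  # mask token used by CamemBERT/RoBERTa-style tokenizers
MASK_RATE = 0.30       # mask rate stated in this model card


def mask_tokens(tokens, mask_rate=MASK_RATE, seed=None):
    """Return (masked_tokens, labels) for one sequence.

    labels[i] holds the original token at masked positions and None
    elsewhere, so a loss would be computed on masked positions only.
    Hypothetical helper for illustration; not part of any library.
    """
    rng = random.Random(seed)
    # Mask a fixed fraction of positions (at least one for non-empty input).
    n_mask = max(1, round(len(tokens) * mask_rate)) if tokens else 0
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK_TOKEN if i in positions else tok
              for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None
              for i, tok in enumerate(tokens)]
    return masked, labels


if __name__ == "__main__":
    words = "le conseil municipal a approuvé le budget de la commune".split()
    masked, labels = mask_tokens(words, seed=0)
    print(masked)  # 3 of the 10 words are replaced by <mask>
    print(labels)  # original words at the masked positions, None elsewhere
```

During training, the model receives the masked sequence and is asked to predict the original tokens at the masked positions. With the `transformers` library, the same idea is usually expressed through `DataCollatorForLanguageModeling` and its `mlm_probability` argument.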