Fill-Mask · Transformers · PyTorch · German · bert · Inference Endpoints
scherrmann committed
Commit 97d6359 • 1 Parent(s): 74100d7

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -19,7 +19,7 @@ This version of German FinBERT starts with the [gbert-base](https://huggingface.
  ## Pre-training
  German FinBERT's pre-training corpus includes a diverse range of financial documents, such as Bundesanzeiger reports, Handelsblatt articles, MarketScreener data, and additional sources including FAZ, ad-hoc announcements, LexisNexis & Event Registry content, Zeit Online articles, Wikipedia entries, and Gabler Wirtschaftslexikon. In total, the corpus spans from 1996 to 2023, consisting of 12.15 million documents with 10.12 billion tokens over 53.19 GB.

- I further pre-train the model for 10,400 steps with a batch size of 4096, which is one epoch. I use an Adam optimizer with decoupled weight decay regularization, with Adam parameters 0.9, 0.98, 1e − 6,a weight
+ I further pre-train the model for 10,400 steps with a batch size of 4096, which is one epoch. I use an Adam optimizer with decoupled weight decay regularization, with Adam parameters 0.9, 0.98, 1e − 6, a weight
  decay of 1e − 5 and a maximal learning rate of 1e − 4. I train the model using an Nvidia DGX A100 node consisting of 8 A100 GPUs with 80 GB of memory each.

  ## Performance
@@ -51,7 +51,7 @@ Moritz Scherrmann: `scherrmann [at] lmu.de`
  For additional details regarding the performance on fine-tuning datasets and benchmark results, please refer to the full documentation provided in the study.

  See also:
- scherrmann/GermanFinBERT_SC
- scherrmann/GermanFinBERT_FP_Topic
- scherrmann/GermanFinBERT_FP_QuAD
- scherrmann/GermanFinBERT_SC_Sentiment
+ - scherrmann/GermanFinBERT_SC
+ - scherrmann/GermanFinBERT_FP_Topic
+ - scherrmann/GermanFinBERT_FP_QuAD
+ - scherrmann/GermanFinBERT_SC_Sentiment
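
For reference, here is a minimal PyTorch sketch of the optimizer setup described in the pre-training paragraph of the diff above: Adam with decoupled weight decay (AdamW), betas (0.9, 0.98), eps 1e-6, weight decay 1e-5, and a peak learning rate of 1e-4 over 10,400 steps. The starting-checkpoint repository ID and the warmup/decay schedule are assumptions; the commit only states the hyperparameters listed.

```python
import torch
from transformers import AutoModelForMaskedLM, get_linear_schedule_with_warmup

# Starting checkpoint: the README names gbert-base; the full repository ID is assumed here.
model = AutoModelForMaskedLM.from_pretrained("deepset/gbert-base")

# Hyperparameters as stated in the README: Adam with decoupled weight decay (AdamW),
# betas (0.9, 0.98), eps 1e-6, weight decay 1e-5, peak learning rate 1e-4,
# trained for 10,400 steps at an effective batch size of 4096 (one epoch over the corpus).
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-4,
    betas=(0.9, 0.98),
    eps=1e-6,
    weight_decay=1e-5,
)

total_steps = 10_400
# The warmup fraction and linear decay are assumptions; the README only gives the maximal learning rate.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),
    num_training_steps=total_steps,
)
```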
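
Since the model card is tagged Fill-Mask, a short usage sketch with the `transformers` pipeline may also help. The model ID below is a placeholder; substitute this repository's actual name (related checkpoints are listed under "See also").

```python
from transformers import pipeline

# Placeholder model ID for illustration; replace with this repository's actual name.
fill_mask = pipeline("fill-mask", model="scherrmann/GermanFinBERT_FP")

# Example German financial sentence with a masked token (BERT-style models use [MASK]).
for prediction in fill_mask("Die Deutsche Bank hat ihren [MASK] im dritten Quartal gesteigert."):
    print(prediction["token_str"], prediction["score"])
```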