Transformers · PyTorch · Graphcore · English · groupbert · Generated from Trainer · Inference Endpoints
Ivan Chelombiev committed
Commit bf20149
1 Parent(s): cb0616e

update model card README.md

Files changed (1)
README.md +4 -5
README.md CHANGED
```diff
@@ -1,8 +1,6 @@
 ---
 tags:
 - generated_from_trainer
-datasets:
-- Graphcore/wikipedia-bert-512
 model-index:
 - name: output-pretrain-groupbert-base-phase2
   results: []
@@ -13,7 +11,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # output-pretrain-groupbert-base-phase2
 
-This model was trained from scratch on the Graphcore/wikipedia-bert-512 dataset.
+This model was trained from scratch on the None dataset.
 
 ## Model description
 
@@ -42,7 +40,8 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 20
 - optimizer: LAMB
 - lr_scheduler_type: linear
-- training_steps: 1
+- lr_scheduler_warmup_ratio: 0.15
+- training_steps: 2038
 - training precision: Mixed Precision
 
 ### Training results
@@ -53,5 +52,5 @@ The following hyperparameters were used during training:
 
 - Transformers 4.20.1
 - Pytorch 1.10.0+cpu
-- Datasets 2.2.2
+- Datasets 2.6.1
 - Tokenizers 0.12.1
```
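
For context on the hyperparameter change: with `lr_scheduler_type: linear`, the new `lr_scheduler_warmup_ratio: 0.15`, and `training_steps: 2038`, the Transformers `Trainer` derives the warmup length from the ratio rather than taking it directly. Below is a minimal sketch of that mapping in plain PyTorch/Transformers; the optimizer and learning rate are placeholders (the actual run used LAMB on Graphcore IPUs, which this sketch does not reproduce). The "None dataset" wording in the updated card is the auto-generated template's fallback once the `datasets:` entry is removed from the YAML front matter.

```python
# Hedged sketch: how a linear schedule with warmup_ratio=0.15 and
# training_steps=2038 resolves to concrete warmup steps. Placeholder
# optimizer and LR; the actual run used LAMB on Graphcore hardware.
import math

import torch
from transformers import get_linear_schedule_with_warmup

training_steps = 2038
warmup_ratio = 0.15
# Trainer rounds the ratio up to whole steps (math.ceil in recent versions).
warmup_steps = math.ceil(training_steps * warmup_ratio)  # 306

params = [torch.nn.Parameter(torch.zeros(1))]   # stand-in parameters
optimizer = torch.optim.AdamW(params, lr=1e-3)  # placeholder optimizer/LR

# LR rises linearly to its peak over the first 306 steps, then decays
# linearly to zero at step 2038.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=warmup_steps,
    num_training_steps=training_steps,
)
```

Call `scheduler.step()` once per optimizer step; past step 2038 the learning rate stays at zero.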