Update README.md
README.md CHANGED
@@ -11,7 +11,7 @@ We also provide a sister model at [Clinical-Longformer](https://huggingface.co/y
 
 
 ### Pre-training
-We initialized Clinical-BigBird from the pre-trained weights of the base version of BigBird. The pre-training process was distributed in parallel to 6 32GB Tesla V100 GPUs. FP16 precision was enabled to accelerate training. We pre-trained Clinical-BigBird for 300,000 steps with batch size of 6×2. The learning rates were 3e-5
+We initialized Clinical-BigBird from the pre-trained weights of the base version of BigBird. The pre-training process was distributed in parallel across 6 32GB Tesla V100 GPUs. FP16 precision was enabled to accelerate training. We pre-trained Clinical-BigBird for 300,000 steps with a batch size of 6×2. The learning rate was 3e-5. The entire pre-training process took more than 2 weeks.
 
 
 ### Usage
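The changed paragraph amounts to a training configuration, which can be sketched with Hugging Face Transformers. This is a minimal illustration, not the authors' actual script: the checkpoint name `google/bigbird-roberta-base` and the reading of "6×2" as 2 sequences on each of 6 GPUs are assumptions.

```python
from transformers import BigBirdForMaskedLM, TrainingArguments

# Initialize from the base BigBird weights, as the README states.
# The exact checkpoint name here is an assumption.
model = BigBirdForMaskedLM.from_pretrained("google/bigbird-roberta-base")

# Hyper-parameters quoted in the README: 300,000 steps, learning rate
# 3e-5, FP16 mixed precision, batch size 6x2 (read here as 6 GPUs with
# 2 sequences each -- an assumption about the notation).
args = TrainingArguments(
    output_dir="clinical-bigbird",
    max_steps=300_000,
    learning_rate=3e-5,
    per_device_train_batch_size=2,
    fp16=True,
)
```

Launched with `torchrun` (or another distributed launcher) on 6 GPUs, `per_device_train_batch_size=2` yields the effective 6×2 batch described above.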