At the time of release, MedGENIE-fid-flan-t5-base-medqa is a new lightweight SOTA

| LLaMa-2 <small>([Liévin et al.](https://arxiv.org/abs/2207.08143))</small> | ∅ | 0-shot | 13B | 31.1 |
| GPT-NeoX <small>([Liévin et al.](https://arxiv.org/abs/2207.08143))</small> | ∅ | 0-shot | 20B | 26.9 |

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- num_devices: 1
- n_context: 5
- per_gpu_batch_size: 1
- accumulation_steps: 4
- total_steps:
- eval_freq:
- optimizer: adamw
- scheduler: linear
- weight_decay: 0.01
- warmup_steps:
- text_maxlength: 1024
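The parameter names above follow the style of a Fusion-in-Decoder (FiD) reader's training arguments. As a minimal sketch (the actual training script is not shown here, and the `TrainingConfig` dataclass below is hypothetical, not part of this repository), the listed values can be collected into a single config object; fields left blank above are kept as `None`:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TrainingConfig:
    # Values copied from the hyperparameter list above.
    learning_rate: float = 5e-05
    num_devices: int = 1
    n_context: int = 5              # contexts fused per question by the FiD encoder
    per_gpu_batch_size: int = 1
    accumulation_steps: int = 4     # gradient accumulation before each optimizer step
    total_steps: Optional[int] = None    # not specified in the card
    eval_freq: Optional[int] = None      # not specified in the card
    optimizer: str = "adamw"
    scheduler: str = "linear"
    weight_decay: float = 0.01
    warmup_steps: Optional[int] = None   # not specified in the card
    text_maxlength: int = 1024      # max tokens per (question + context) input

cfg = TrainingConfig()
# Effective batch size combines per-GPU batch, accumulation, and device count.
effective_batch = cfg.per_gpu_batch_size * cfg.accumulation_steps * cfg.num_devices
```

With these values the effective batch size is 1 × 4 × 1 = 4 examples per optimizer step.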