Add training procedure info
README.md
CHANGED
@@ -66,6 +66,10 @@ Which might generate something like:
 
 Same process applies. Usually, it is best to do a sliding window over the user and model turns, but keep the system prompt fixed at the start of the context window.
 
+## Training Procedure
+
+This model was trained using the Metharme-v2 dataset (1 epoch) with 4x A100-40G GPUs. The run took 12 hours with `bsz=2` and `gradient_accumulation_steps=1024`.
+
 ## Evaluation Metrics
 The model was evaluated using EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) test suite. It was evaluated on the following tasks:
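
The sliding-window advice in the context lines above can be sketched roughly as follows. This is a minimal illustration, not the README's own code: `build_context`, `max_tokens`, and the whitespace-based `count_tokens` are hypothetical names and simplifications, and a real setup would count tokens with the model's tokenizer.

```python
def count_tokens(text):
    # Assumption: whitespace splitting stands in for the model's real tokenizer.
    return len(text.split())


def build_context(system_prompt, turns, max_tokens):
    """Keep the system prompt fixed and fit as many recent turns as possible."""
    # The system prompt is always kept, so its cost comes off the budget first.
    budget = max_tokens - count_tokens(system_prompt)
    kept = []
    # Walk backwards from the newest turn and stop at the first one that does not fit.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    # Restore chronological order, with the system prompt at the start.
    return [system_prompt] + kept[::-1]


context = build_context("sys prompt", ["a b c", "d e", "f g h i"], max_tokens=8)
```

With these inputs the oldest turn is dropped, while the system prompt and the two most recent turns are kept.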
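
For readers who want to relate `bsz=2` and `gradient_accumulation_steps=1024` to the number of sequences per optimizer step, the arithmetic can be sketched as below. The added text does not say whether `bsz` is per GPU or global, so both readings are shown as assumptions; `effective_batch_size` is a hypothetical helper, not part of any training library.

```python
def effective_batch_size(micro_batch, grad_accum_steps, num_devices=1):
    # Sequences contributing to one optimizer step under plain data parallelism.
    return micro_batch * grad_accum_steps * num_devices


# Assumption A: bsz=2 is the per-GPU micro-batch, replicated across 4 GPUs.
per_device_reading = effective_batch_size(2, 1024, num_devices=4)

# Assumption B: bsz=2 is already the global micro-batch across all GPUs.
global_reading = effective_batch_size(2, 1024)

print(per_device_reading, global_reading)
```

Under assumption A the effective batch is 8192 sequences per step; under assumption B it is 2048.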