Update README.md
README.md CHANGED
@@ -5,7 +5,7 @@ license: apache-2.0
 # LimaRP-Mistral-7B-v0.1 (Alpaca)
 
 This is a version of LimaRP for [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) with
-about 2000 training samples _up to_ 9k tokens length
+about 2000 training samples _up to_ 9k tokens length.
 
 For more details about LimaRP, see the model page for the [previously released v2 version for Llama-2](https://huggingface.co/lemonilia/limarp-llama2-v2).
 Most details written there apply for this version as well. Generally speaking, LimaRP is a longform-oriented, novel-style
@@ -100,10 +100,13 @@ on 4x NVidia A40 GPUs.
 The A40 GPUs have been graciously provided by [Arc Compute](https://www.arccompute.io/).
 
 ### Training hyperparameters
+Although 1 training epoch was used, the underlying data comprised the same samples repeated twice
+in slightly different formats.
+
 - learning_rate: 0.0003
 - lr_scheduler: constant_with_warmup
 - noisy_embedding_alpha: 5
-- num_epochs:
+- num_epochs: 1
 - sequence_len: 8750
 - lora_r: 256
 - lora_alpha: 16
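As a rough illustration of how the hyperparameters listed in the hunk above map onto common fine-tuning code, here is a minimal sketch using Hugging Face `peft` and `transformers`. The actual training stack is not shown in this commit, so the class choices, the `output_dir`, and `neftune_noise_alpha` (taken as an assumed analogue of `noisy_embedding_alpha`) are illustrative assumptions, not the author's configuration.

```python
# Minimal sketch (assumptions, not the author's training code): the listed
# hyperparameters expressed with Hugging Face peft/transformers objects.
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings from the hyperparameter list.
lora_config = LoraConfig(
    r=256,            # lora_r: 256
    lora_alpha=16,    # lora_alpha: 16
    task_type="CAUSAL_LM",
)

# Optimizer/schedule settings from the hyperparameter list.
training_args = TrainingArguments(
    output_dir="limarp-mistral-7b-lora",       # hypothetical output path
    learning_rate=3e-4,                        # learning_rate: 0.0003
    lr_scheduler_type="constant_with_warmup",  # lr_scheduler
    num_train_epochs=1,                        # num_epochs: 1
    neftune_noise_alpha=5,                     # assumed analogue of noisy_embedding_alpha: 5
)
# sequence_len: 8750 would be enforced at tokenization/packing time, not here.
```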