End of training
README.md
CHANGED
@@ -114,9 +114,9 @@ LlamaForCausalLM(
 <br/>
 
 # Train Dataset
-Trained on
+Trained on 553,266,374 tokens from the [wikimedia/wikipedia](https://huggingface.co/datasets/wikimedia/wikipedia) dataset.
 
-- Num Samples: `
+- Num Samples: `998,000`
 - Subset: `20231101.en`
 - Split: `train`
 
@@ -163,7 +163,7 @@ The following hyperparameters were used during training:
 weight=0
 )
 )`
-- lr_scheduler: `<torch.optim.lr_scheduler.LambdaLR object at
+- lr_scheduler: `<torch.optim.lr_scheduler.LambdaLR object at 0x76ca190e3fd0>`
 - student_model_name_or_path: `None`
 - student_config_name_or_path: `None`
 - student_model_config: `{'num_hidden_layers': 15}`
@@ -178,8 +178,8 @@ The following hyperparameters were used during training:
 - dataset_subset: `20231101.en`
 - dataset_split: `train`
 - dataset_column_name: `text`
-- dataset_sample_size: `
+- dataset_sample_size: `1000000`
-- dataset_max_seq_length: `
+- dataset_max_seq_length: `1024`
 - dataset_test_size: `0.002`
 - dataset_shuffle: `False`
 - dataset_shuffle_seed: `42`
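Reviewer note: the new `Num Samples: 998,000` value appears consistent with `dataset_sample_size: 1000000` and `dataset_test_size: 0.002` from the same README. The sketch below checks that arithmetic and derives a rough tokens-per-sample figure from the reported token count. It assumes the test split is carved out of the sampled rows and the remainder is used for training; the diff does not confirm how the training script actually splits the data.

```python
# Values copied from the README diff.
dataset_sample_size = 1_000_000
dataset_test_size = 0.002
dataset_max_seq_length = 1024
reported_train_tokens = 553_266_374

# Assumption: the held-out test rows are a fraction of the sampled rows,
# and everything else becomes the training split.
test_samples = round(dataset_sample_size * dataset_test_size)
train_samples = dataset_sample_size - test_samples

# Average tokens per training sample implied by the reported totals.
avg_tokens_per_sample = reported_train_tokens / train_samples

print(f"train samples: {train_samples:,}")
print(f"avg tokens per sample: {avg_tokens_per_sample:.1f}")
print(f"below max_seq_length: {avg_tokens_per_sample < dataset_max_seq_length}")
```

Under that assumption the computed train-sample count matches the README figure, and the implied average sits well under the 1024-token sequence cap.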
logs/dataset_max_seq_length=1024, dataset_sample_size=1000000, per_device_train_batch_size=4/events.out.tfevents.1726578027.1c1a426a2fee
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5e963819a9110e6b02e72aacfad2c6edae23f8294c41820d65d703d34ebf4ee
+size 529
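Reviewer note: the added TensorBoard event file is stored as a Git LFS pointer, so the three lines above are metadata and not the log contents. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of Git LFS or any library.

```python
# Pointer text copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b5e963819a9110e6b02e72aacfad2c6edae23f8294c41820d65d703d34ebf4ee
size 529
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    # Each non-empty line is a key, one space, then the value.
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    # The oid has the form '<hash algorithm>:<hex digest>'.
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], pointer["size"])
```

The `size` field is the byte length of the actual event file, so at 529 bytes this log is very small.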