diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/README.md b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_config.json b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_model.safetensors b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..05158166ff25e26ac4725f976061befc623ffc9e --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5b9043cda8563d5c0b2318f0b2fff17e5ae4453ee006dcd1352fedb6c6633055 +size 67144544 diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/training_args.bin b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/README.md b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_config.json b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_model.safetensors b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..46e057279b82880a6790abbd45f01cfc7994f52c --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9390098b99a81f26fa38bbe981db80eba9ff50df23f7ff415e967e317c8606ee +size 67144544 diff --git a/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/training_args.bin b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/model_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..954cb4135fea269dbaac0231adaa30ef41e50f21 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:afd7f61c3ba2dad32e6867d05a7c3be17fcf984d4bc7bec4f0e3a4b31885fa22 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..62dc72e2ba1f073e67f08c61d81cc4e4b80c1d87 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0a98b7e61224fe0c2b44065afddaafb3c8bb516638455135a694436c2b8672ac +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d0cb160fc6752dc0470bb88b1ba16dca7ed969ca --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fd418aa175a4f9508778329e5c11f54241882ad7316c344103bc3804e613599f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..c7c7ba5a5d73c30d2e2dfccf92552709b61b1a0f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d4f7e5b3f15e6248eb69742a14f905c700ecf357f80b4e2f91b8b83b2a38d15e +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8c5e85e5d5660dd81c4770ecf163b8282219e665 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/trainer_state.json @@ -0,0 +1,48 @@ +{ + "best_metric": 2.462397813796997, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10", + "epoch": 0.13333333333333333, + "eval_steps": 10, + "global_step": 10, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1638607198617600.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..3e2200fee38eb0a4fb3ed29b30d9c09335ca0d1d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8090a746e03d6771ce51a4f69d4f7419588db1ae56ed6fb72dfef320583626cf +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d84b61d4d630721eb8038dbb4a0930f606ece3a3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ae4c8b92130b951507242330b26d765d62409302f7f9e18fd267e763f524f706 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..e6cdf36295b4d559507cf0b068680edea3de3a81 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:46513e9b1de488f3d70a4461303e6b827989f588807354e14d010b7ee4f4679f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4f1a24bb7d4e46bd15c0b55412cc8ba9b9556c35 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd2ccdaca083e589c09bcd97757fde390a191ed5c643ace13a70b750fd4a4e4b +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..6a2dd128bdc3efd728a1e5ad3b6ff1aaf4bcd63d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/trainer_state.json @@ -0,0 +1,183 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.3333333333333333, + "eval_steps": 10, + "global_step": 100, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.6386071986176e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bcf78c3400bc4f6af9b132d44253fc1255c5dd4c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ce525d2c36598a2956de71ece153206132d2b09e391e8aea266e0ae2de3e1796 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0a05335351028aff079ff03ef9363d1a8b7659ea --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a86b35a4be7e6f5ba97b82837fb589bafa09846478a3b0bbf0114c22123e773c +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..e8b03e39b0cf81b4b723b9421b9fca8f87c7b414 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:319884e2d6c1fad0795ced8add37e8073910c77073120da512a5e6a1f6208d62 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..24dce4e18218617e13af9f93046f397a711717c2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe938637817d41932e7175fe8d9bcdaa1f1383328b73e4b56a4e373476a295ba +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a4e42c22d7892be75856704f08a44c2101aeb061 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/trainer_state.json @@ -0,0 +1,198 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.4666666666666668, + "eval_steps": 10, + "global_step": 110, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.80246791847936e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d5454187ebc1e3f453316a99006137cb3c4d63df --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dda2b971176b1492656fa28493d58c326dbc1e9f68f056eaef616bfe1107eeb3 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d8b3e2df6a08c3ed4cbf2e40b58b15b49f0cec01 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2d97f0a3cc1e1c8b5bbde0b5151122a853076b7a2afaef96634d6bf8914273cc +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..71b7a5227226dcaeadffec096acbc7df0f632989 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3500ac793bd5f15c49da717801f854f9815260499ab4bc16b8f3a1ca9c82dfdf +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9eb62bb8ba22966a1e254979e1d2479886d174dd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:145815d6a6480fb85323e9a0f9a98f3e8faa57003487fcac0be85abbf27b4575 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a9ee5b192b4bbe695921b01f60072e8efcfe9a12 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/trainer_state.json @@ -0,0 +1,213 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.6, + "eval_steps": 10, + "global_step": 120, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.96632863834112e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..616cc18c1a2f57f37a478c67abce401e29726fa5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:42bf0417643eadb1fd7ca87b1e33f16ecbeadf0092845ae324f9f4b5cf3c077f +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ee0b1894e3f725d5a9d7dea0e3ca9c531e1abdd9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:85e18d9537fa23d5a368ccf3461c517e1eab4991aaf92f327c5a4e3190e23177 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b60cc4cb8217ae694c7a8efef0eb0b676d897e83 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:602f503f7cd2e84c0b6719714b66d34e98b340f44b02ba8ffc44df096e786100 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0dae8e46aca4beacf0c154c37d71abe175363a25 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:abdc7730bfbf0869132cbbd456c580122a20a540399e30640d4e51daf6f379d3 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..238e20a1b27b2416c72041d7fdfcef69ef8dac9a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/trainer_state.json @@ -0,0 +1,228 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.7333333333333334, + "eval_steps": 10, + "global_step": 130, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.13018935820288e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..718a584244a0f1ac9a55c4f6c705a44553ab5f43 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9943b61a33b742c20ccd0d68a70b1f4f04b0d9b2aaac9737255f3e3ce4f40a2a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..6fd3633ebce596b79f8cd4077c5a991ca593474e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1d63119c73affcc4e4cd955ef25f0e4395d013ff03d507f3b5e873059ea301cb +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d05f19f3c7e1e4b728f62f56852d18785b6ab4d0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03c218af617af689aa7eff2d02ae91fb859e96fcb9571b641c5e95247f137dda +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b77086a6cbb29f3cd0e1ac947f6c71c390b2dff3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:21a6935970b037ba9fc4b9dc75dbda421fb162f0fa5b7d5502a5e9660c005897 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c464ad5df8b99b8291613095469fbbaccbd471d0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/trainer_state.json @@ -0,0 +1,243 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.8666666666666667, + "eval_steps": 10, + "global_step": 140, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.29405007806464e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bb9ab83bd562e8c7cf74236704edb8094f13ee0f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e6e4a3e7bc03f2f2ebb2ef92730e5074d0584a64142c3bbd9c02cf066565a5a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ba354b4e62206943a1b03cb30fb163089260337a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:95148fcf3cf8553150f887607fdb11e0733577006bb2ef743b3e9b4a4098a6ff +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..61dde1ed8b180510bbda84f0c71356862600ad55 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bdf2188bfe5b1127367f0a0d0628c845d9f54239950b10ed26be9372dba68d0b +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e9d3263bcfa5d62a56c74c931026d6e1762a1781 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0d75316f47d5ef08dad7230d3c189fb5ad736372bf2da793895c59a4ccba811f +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..0199951e11230ca8ca662ef7702935f95e12535a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/trainer_state.json @@ -0,0 +1,258 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.0, + "eval_steps": 10, + "global_step": 150, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.4579107979264e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a745cda1fa857a066b1b069df154b59351008282 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:468c9267e4e81cfa6c3ed81ecc9053831e79d4c46105870f8bae20bd89671070 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5487eeea5f4b8f347714aa6c501c6fdae7f2c865 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:986cf5fbf04ed05e812ea675c17bbadab9b3ffabad93483018f6f7d81b2b2173 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..564fc6da8e7c6b2c0f5b62f1f2e55b96ec29c066 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0f1a4ff62819275ae908067e10e49db3630270d7e753db72e5d286184508926f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..333a8435179bb1a27e74cf71169524425347df64 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c60f731d4cb1d489de80d48b0d2bf2049ddfec30c083dac3c65e6fc26b9708e +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..084049683e3a32ea0b228dfcffc172c5a0eb0414 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/trainer_state.json @@ -0,0 +1,273 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.1333333333333333, + "eval_steps": 10, + "global_step": 160, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.62177151778816e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d8aeed09adbfc6d342b7ba8572f991134178de51 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0f61a9b3e209d8e2fd71ae5566837e0395741629e065944ddbc89fe11b7fe304 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..18e3ca0103f2ae2cfedc13fd89b1cf605b3124cb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:be8d32d197fc6abd67ee70bc8473b5ade5a514b364d09feb634b0f9995f4853a +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..c13cd397e2cbe97d2fb9e944d382c58418c6b136 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:964f6178720317ac51eb375c889b2d86c7184aa024caf52b59339853ffae03ca +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..98392616735ef4e842735f8fdb0443dd62c47cc3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3f8316c64c3f1dcba9f5f78f5461a5450278d6310afba0a2471aa470b51e14fa +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d961f9083d2d6c1d6a97acc71cbde68f64512aaf --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/trainer_state.json @@ -0,0 +1,288 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.2666666666666666, + "eval_steps": 10, + "global_step": 170, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.78563223764992e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..245824c3e5db67705b26825bf03f8384a688b9a7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a58d5b8ca385191530fd8f99492995c9f810f3b4660df68ea3dfdc1985e419db +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2d687161a35dc2829e3ad2b55df8f5eac0198461 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f7abdee3b94580e96d58e3bf62c257455a14816e17447f9c0b04fb60d771eddc +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fdca3aeb31ce5b4aeb2c0f2ba53e3e43b6334331 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b79baa0842c2916b082cba36f9f2b958210e6d7c1813742841fb908cae57fbd +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4c07d6d39c8000e4887811925b35913c0d0fb9e7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5c510c48cbd7d4a31b049b9ce577d9a61337bf5b3120da8df24159e22a5b61b +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c7680d62fae50a4ee9ac6888f964946a6dd85e2f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/trainer_state.json @@ -0,0 +1,303 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.4, + "eval_steps": 10, + "global_step": 180, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.94949295751168e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b3a3d96cd658f83ba6af432998f7c59a210b9972 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4becb2c7fa248bba9a01f4b0b8d33431956d5d851efda9a1a83b92b1323cd8a9 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..57b1556e224119d9062a4103bce996d1e224ea8f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4ef0d08ab022a76e03b5ec3bfc8568c39a501231100b7abf3a9b7e6db9d49c3d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ae44ad6727cf9b3af903ea84902fa6c7f13a5a95 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7d6f4346bdc8a12fcc48535a6002ac46345e4ce1e14bb1f7e9dc3b0ea920641c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..c8d96687f829fbdebf86c73104630c11643191e8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:56d80eccd9a2998f395870ad7a48e8df26a0ef5fdd75c8bb18466e506b523f6a +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f536e2eb9081bcc3ca4bdfe1a9ca538b9484ad11 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/trainer_state.json @@ -0,0 +1,318 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.533333333333333, + "eval_steps": 10, + "global_step": 190, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.11335367737344e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..44e18f9b566965b067596338905091027df102e8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:45934df6dcd9429dc072997c5ffe54b206621f54ffd535333a1e1ea333ae6821 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b8c83492bafc2f744f248030476b0e31e9730066 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:89a2b431bc09118dc22a555b344d158feff28c599722620797a0bb40466f7033 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fe515b4492af517bd45c5a5c7abbba2b94c5ae37 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5087ba42b4dd9dc68875c89890b692068c71de7009ff67cb7d8492bce11049 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..61e40aef0a507fb8add486ba2535aadaa164b9a7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72a91e63074e9f0fdfc6b1e7414643f389732ccfdfe97b6b3f4c5b0d7a7556a4 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8f64570b6ff683b9fe8d69348f75fa44c2f50b29 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/trainer_state.json @@ -0,0 +1,63 @@ +{ + "best_metric": 2.461599588394165, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20", + "epoch": 0.26666666666666666, + "eval_steps": 10, + "global_step": 20, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3277214397235200.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4ef077c2e9bcbc840308313c5e4fc2a422c1000a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fd33463647f0327e0807205e2a8170f2c530abb74aba43fb70ffb23e081845b6 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f1ffed897f10d98723b527806828127726037a5e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dcb87a209a57842213da7ad86f806dca3b1a8b1b07b9ad9464216ed234235115 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..da263858f32b7536e68a33626ef41e3ef7a44689 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0dbbe288070e588c7effbe11249d330a3ad16131211e6b5dff1d03a8ebc7517f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0fc1f2bea0ca1c9908bf307e3525efa76fd70425 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e5b7dec72c2b7f015512ea839980ec16d0582c7e6d0689dad8794261e73838b6 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..6b8f6a89a7b23cdd46c401c5f31e28809a4c0882 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/trainer_state.json @@ -0,0 +1,333 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.6666666666666665, + "eval_steps": 10, + "global_step": 200, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.2772143972352e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1e951e8011a00660406e599378ef18dc0e06ac10 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:92b96f5f47da443e6cd249b05a5192142fe55593ba6a69514ae492fb1c2a727a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ce750f786f7c6c6112e0ecf635491379019559e5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bc9ebd6b5fcbcb3babe00d7649d35de2a434c64a26c70e47d785304134942158 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..605214081e6b3060d6c3e526fc86e8b8fff3c71b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd4e0019fadc179e2ea531ff33d86db759cb80e64a8826bb6bfa90c2483bfc04 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3b44a2b0d3df617f15242e2d4ea4d5553b544573 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c59d7cb173602f981a42f5fe61d72e03c87c9f97f456afe9fd66cd09957f177 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8d5fe87260ffcc0a48eb66835db4c25f667ef01e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/trainer_state.json @@ -0,0 +1,348 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.8, + "eval_steps": 10, + "global_step": 210, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.44107511709696e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..07e564986a561288f1308a506e4d8a59abc2a18c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6b95ff71de6a231e4e9be8fcf4037fff497b2cce5b14ad2a996f1ba8ad9dca00 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e3eede7eef533cbc1dfc26e8a7c8543d02a152d0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b6084bce7de97bad29aa854448dcd5bcc388e51e85347373320e49628179291 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..823c878e3ad7d7799e1959fba97c90aaf79af4f9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f5e4256f7b7ace2dd6194570c191ab9026456dc0db24025edac4a5bd9e379dab +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..63906caa8cd7e3fc0686b7d0276e496942ef0036 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:daeceb22ea0c54e6923c8a042a9cfc5a5bc826f201c52f29454b62c289d49dc6 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4586a7c7528ad2b80e42685615e2a868d9f949ff --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/trainer_state.json @@ -0,0 +1,363 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.9333333333333336, + "eval_steps": 10, + "global_step": 220, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.60493583695872e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0bcf20d2566af32175ff715bbc5f96f038664a70 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9ac78443cc578d8d3c5891a11ab3e23120aaf80cbc999fcd2acc3439d81a9375 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..338065f74b5fbd1de2fef9cdb996dbbba7b7d585 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:225105c9b36f111d22cea2efabd77282c3f139ff5ccbe308627025a83652202d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ae85ad205796b2c3955218eb7b4b348ca35978c7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e2b38199e26ee1965ef79aea019c0217039e7dab109a4b6e29c57f1bea63d6d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3e92e5593d8d7139e837b2a75209a41c074c2e8c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7f755a0bef74517fb45fc39d7689eaec499187cc5cd60002751078b0276b353 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..dd564c8b387437e182a278e439e94ee79b5e661f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/trainer_state.json @@ -0,0 +1,378 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.066666666666667, + "eval_steps": 10, + "global_step": 230, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.76879655682048e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1ba0efe0eb55a97802639bff4e69cbdd18107d1c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67436bbc75916b8fcbd006562757200b5bfd511a22b7daecdbd459fb70500b46 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..04bf9154cb588c732234d504a9c7460a14fcbbdc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c287de9866264a811110ae43766fc4511b49bcdd5e0825d11227183354b698b +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..846c31e0418b3b3196b4e9c5d730a866c947d1d6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:33d7857a6e3603508425c326c1a1dee439799d2c72bbfc8afcabbb8578757780 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..31e05b86275fed970cdeadc24115c84e19feae09 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f73703efe567bf60e5ab219b736abd5d1183aaab558b64454b92f8bc5cf1b3fd +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f68e0e800a7aea60e6f9c0a61ad1a40ebdce06ec --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/trainer_state.json @@ -0,0 +1,393 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.2, + "eval_steps": 10, + "global_step": 240, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.93265727668224e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9f0985426442299b858aca11bf8b863746b1fc00 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c48b959b5ea19475833f8199777ee714c285270d1b0073119fb17972833d1c3 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..aca1431b075f117ccdb481dd6ae8a76544e51f75 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6ef10fa5c813675a1fead5161067a053d14bf24701cc04efa1822270cca83acd +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..90df82c0a610ae490c2592c79d46fe23cde8d351 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5b7a10b9f8de84d4eac8f0b5437669695e0a3ed004e055b39340577de17c55 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..174b5438f88f4c3c799b43c4f559ca991fb938b4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7ef310c01f40cba8e9c44af8332d1cb681a7026399804fa2296ed59c6594e708 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..86c7653caecfb0b9e911d6b4f05e794e5f1eddd0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/trainer_state.json @@ -0,0 +1,408 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.3333333333333335, + "eval_steps": 10, + "global_step": 250, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.096517996544e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..27b80c9cb7432c458dc505639219c86ca822a0aa --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a7cb84b44e4595c8e4314157e36a52f6e9b83ec7ae83ce90896b595e94e769ae +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b43649b967fe0b9e70c27adb22cdf460dc6123d8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d75830d3ae0afa6d8739f82913d1f0586de9f5886e66c49d3387bbb55c0d0ec +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..293d181974003fee2540af0648cfb4e42786ca56 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:78bbc69e88d5e1fb15138660b4de76d03b9476fa1ab2d16370f894a65eab3da3 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9431a5b0a8e3a7cfc7a6acff3f3ba51f0ea91b16 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e388642b0db2b68dfc847810d17830763a6c1ccd5a0a2c34607435281dfa7f25 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bfa78e5d36c3aff16e500cf65d65e372e5ed8758 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/trainer_state.json @@ -0,0 +1,423 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.466666666666667, + "eval_steps": 10, + "global_step": 260, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.26037871640576e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..160cb8a2e27e15fb679b7a3ca402910a2e388d15 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:595c18b8a7eb6e58b4d76a48fbce1e1bf99d8de09026bd024200945bcc0b3e15 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..06816f640f163d22f8c8de1b6664ff7cdef703b6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7296a8b96c5799256710f150512c8110219c204f679595a096495b9f2b682001 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ba62c782c818c1b90b0344e262a00bb91255dc87 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2af2c0de08ddef877a4af0e5f2dfe4570d2f029659f125fbfe3bbcce3a8b09e6 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ed00d4e9803635011eee9bbdae275cac04953c1b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9b86f25c4fadc98da61c18896b4c25ab399b3a23b766274b50979d4340358b17 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..73a97925f4820f5178a8f5d905517a7e030b57d7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/trainer_state.json @@ -0,0 +1,438 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.6, + "eval_steps": 10, + "global_step": 270, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.42423943626752e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d6d1635e4445735679c19e3006b9d2c88eea8adc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:48daa57d6ab1e696f5d048bbe9e3eb8689b08275c82c1cea967bc2d3cd7ef3e5 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..581be056fd2fcd63de5449a2531997d7fb67ed75 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e611331a58c06b89f09d685e8a47fad1b27a4f27dc7913d1751b5ffd846d0e5a +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1702f62666b39cac633a34cf312f24e311e13df2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9ba79aaff190fd3ef9f70dd7c0a234665c2bd6c6bb243b5896c5bd6a16356627 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5e2ffe406e2d87ea70e25bfbdad4187edda05acb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3d68cb0fb8d225e623592feefec72ecd0b7071657fb56415f262582b52279a56 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bb0a1c1f8beac24693baf9e127004cbbc3d9481e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/trainer_state.json @@ -0,0 +1,453 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.7333333333333334, + "eval_steps": 10, + "global_step": 280, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.58810015612928e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ff23632732fe237a9ccffd541794093f30e6991d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:de77cd323ef71f2b2ecdf7e92581bb5c14f14f9e3df0c4535db2a24be31817bc +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d639f67a6f410f0a600adf7ae93bb4970d5007cc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c41d61ac8c4d45e59ec5769efd2b4a13197767ba1c3a51f7461eb12170f7bee +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fecfedbf1488a31afeaf7c01dc4f9760cfff1b16 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:47c6345b8afbd1f7a687e942ce33ce022660a29cb46a23e4c9eda9e498053741 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1da17016e7f80351316298af3ab35d6cc666d60f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:518f59f6861d3d54674180d781456c4d55d82eb1d5543c592846efd5b6bea3ea +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..0fd13afc489cf1c8341c608f4ac56f331b776228 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/trainer_state.json @@ -0,0 +1,468 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.8666666666666667, + "eval_steps": 10, + "global_step": 290, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.75196087599104e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e4cc03009f5aa93a2eff44f46775f73657b1d045 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:77b628c3ffff168d2db2e610d9b4d647a0313e4485d3022c6412de9a2541388f +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..43a9e07cd424265376ea30b35cf4af8737878a0b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:43af90c651b1b5beb0035db25647643514124e6d3e3969f79d34db03d2874e41 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..76ee62462f7b8b87edaf24539d12d81995c70164 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3a5478e4e53ebdf948038ed344f6e976416991ec94630cb094a18d5adf7aae7a +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8e3204abc81bf616d4220ccab7f0f13520ce949e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:19debbf018dbf40b240b0a2ef65d5d10de2fa92e61c8838b0319c8c96ad962cd +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e3b40279420fdd38b7be3fd67c1ad82a4fbcfd89 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/trainer_state.json @@ -0,0 +1,78 @@ +{ + "best_metric": 2.4604127407073975, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30", + "epoch": 0.4, + "eval_steps": 10, + "global_step": 30, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4915821595852800.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..800a517eb58ab59ecfd2581e7572c251f7c62681 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7ad12b33418b76e6969d368dd7eaa8ad5d023894a594ff1a037e0c6246008ec5 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d7c93238534d1af0f73a2107435181a2abfa386d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f09601820fb625c47cb1fdd27a824e3a110493346a4c9ff0992a6a5a85939f7c +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d8ba268ef07796e970a23442889935701a1dda5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2574c6149307e492ef05d2031918a546356cc654f4671c817f05ae6d0764de7f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8640fdd49a163110aee721e1510c7d552b4242d7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:30336c219d20749546325363bfa0b5ee5e9d4b073a303024ff3ad347834b8c13 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ed06273b93bb1ff7db3e9c3128d42c0ad1f0c219 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/trainer_state.json @@ -0,0 +1,483 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.0, + "eval_steps": 10, + "global_step": 300, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.9158215958528e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..aff68c92443c4a55b0809c077a7644683159f37c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe4f2b13bacf55b8f7424b10ad2cb1fcc63e93fbb40f05e5fac48fd44e735569 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..655e67896975a1fc2a3db419581a10afa23a96bc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8f2b79a289514f9fb14c7546371646c0819f2c0d8ab4955977ffb148e8a725a9 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..a5b4503b006d8dec33c7a086d3d007eef4282144 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a82d768c5f5c231c8b50481a409281b8639e231a185281a7476164488eb6c27f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..58f0265f6abd6b6684c5edee08f03cf244492dc5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f67fc10f846f52b9c0359f08a436d3ebec080f189f60c98def04956b2dc83cd7 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..878342a045f2476be92d35f1c71b13a45ae1b88e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/trainer_state.json @@ -0,0 +1,498 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.133333333333334, + "eval_steps": 10, + "global_step": 310, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.07968231571456e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..52014d1ef47fec2bca319391104a40b812ac2ad4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fb9a1857e2b0c31a375173c41ce102d6044c4841c2118bc8102d8e3a86674df4 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..067c2600d7afe7d0495a42ed02bcc31e2adc5584 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0ead62d6a8a08e30af35e9ed0a25d86a0de1aad8311803df4179c9dc01105e1d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f5fbaf3739704eea759ab29b4b9eba0fecf79ee6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4f581763059f9808c6971d543bee5e034fff1a9ec174cb7aa232dd9f17099da0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0f02c233f432413573681087f8ebce358efeb676 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ae2093149925b534f5c60211635bf0097e5b3bf50dc856b0e3f5b17717e52497 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3eb4a287ab13369f9f7ea331269002fffdb21e09 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/trainer_state.json @@ -0,0 +1,513 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.266666666666667, + "eval_steps": 10, + "global_step": 320, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.24354303557632e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e69596438e8d47e8f1d88d445b3ea10b3c4fa8c8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f11283c3694e2edec71023d381f5c8f820d1909db44c51d2c6291aa53f3033c1 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1c81fcc4219659564f5c1be0b8343a70c70931f4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a27a715cb26295c1e41aa66199760b306e257b78778a2ad42f8d95f81bec4270 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..759bff60bd0897427bf9d4410df520d35fd20081 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:389caf1bb32aae3a751e11d63ffe273f089df59490c4ac6e5883d944b329df0b +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..2df061fe83adad240544d1899eb2e5e2fb23a555 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:22b896fe763bef96dfe0d570de4fea5d935b3bf80de3a9b1b2918efca334b093 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..52d3374dad8e0b20e31099f35b0904ffb2c0b54a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/trainer_state.json @@ -0,0 +1,528 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.4, + "eval_steps": 10, + "global_step": 330, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.40740375543808e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..abfa589ba116f57d9302fefcbda8fd9e4c2e39bd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4b8e3b4109bbcaeca4f2a8099fbfe5600d1fb177cc57c555dd0874af5ccb98bc +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..ff8e9b7bd3b6cb0fa3ffe6e59c1643e6116fb0c2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7c10fc5295f24ebfd219d87f540fd0a4dbad089b8d1da8b732ef098291f73692 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d7fc830aabf2c4827b0609ed6e355d0fa80523b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b904f845552beb994fcd34362e728f918c7473ac27288d463195b51c3ed73bff +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..28521d181a67af05811165bf7cec3a0fcb49ae9d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d03bb25f48f188323d4c5dda872d760e309dedbed641397ec2ec756835c29ac5 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d49de81fd5de31867c32655fb91e071d1b109694 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/trainer_state.json @@ -0,0 +1,543 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.533333333333333, + "eval_steps": 10, + "global_step": 340, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.57126447529984e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a30341d7b3d2da789213c82bed34883670ac0ca7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54c2bb3f6fdd2b337b7881400f2603b8da2064508db09cbb312209fab77d34d7 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d647c192dec8ed8395020972bef2a0aa7ec737c4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:63441dd73bd4a09b386d7231a25c0698f610c2fa488239d1efe2e8d48169b9ef +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fc3bb37d365dcd8ae3528d8e7242f7d2eae755b3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39cd0c0a4049d541d90e7c6154cb21167a341830884ad3558195617942678446 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3fc1a8fc07398191149b701be855b2b30b04d498 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8b8cfefd46d2412b7b17da7d799f9e9021312d0b294976f3e87f7063aa01557b +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..9272400276cb8af9cc07fef018ec6abb3a221a47 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/trainer_state.json @@ -0,0 +1,558 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.666666666666667, + "eval_steps": 10, + "global_step": 350, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.7351251951616e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..fce865d769207ca4ad57f5c240fdcfc5edfbf74b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4c6e6abfc9a55ca36b77c4f21da23c91fa0298c48964952f6184d4db14b3cd0b +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..45b3806f3f3dc0830b05645027cf29bf1327f6d3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:159caa3ede50dbc59ddb78bdacb3cc46459ef2c038f6fd91cf92665be124be1b +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..dff7e422d3f8fc71ea77fa33b28878ffbe8abd43 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d73d43b628bfbe3f56e29099c04e9e9584349f935d8148aa8c34849bf03ef49 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..fc7191e0d24d86be98ffef99b67fe56b52160821 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6cb47a43082c3958508d73a1bd58f111764a18725005ed6a37a8d99585cef386 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..0ad0ab04e0db3931514fe9a3a41a8982e7509026 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/trainer_state.json @@ -0,0 +1,573 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.8, + "eval_steps": 10, + "global_step": 360, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.89898591502336e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..20a6f53dd99caf3417455d8b7178dfd5e2a3e0ba --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6ac0b072b28c3b0c45db37423748feef739f18f3e4a4afad72ffc5a0e0d66c42 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..62b6e65992dc113c04cf6d4ed0e4e39285990261 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b93138f3fc28bd9387875e9d7cd7b38852b372440f2cbf564d395cfb83b44497 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..792417d4c800bc4c8f7eb21d5421678309a6165b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7c0e313f3d6f9e1adc7603b9ffa6f0ab3438f71ce0c71bd9a788485d02b981c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8bd8e28f2af2a751646ea36889854c5eda0b2292 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1511804f46c0ca65fb38b3cc2eecf2ff9872408b4f80615834923e731745685a +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..9b62aac6ee0d78e0d023b1c9bedd37670fce9bb6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/trainer_state.json @@ -0,0 +1,588 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.933333333333334, + "eval_steps": 10, + "global_step": 370, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.06284663488512e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c7fa40d409ba884af2bdc985a0db902485b65546 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f08bac9163bac1ca560c544bc0cec5240de291010ad0da3734b0b3ca9c63cc4 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..94d7d8bbee48d5e6745e1f30b5837d3f59f918a1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f32b7d2c6100acf5c0106c871bc73140f735e315439a01bcef28844106817cb8 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f3b952e81c9ed8c37528c0b9d4c13811ac0b62d3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5ce5744fa32738c65fe7785ec589c49d96370233c9386567c3f06dceedb5f2c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d9c94a2d554cb9176e1f6452c1c8064e701f6c9c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:69cb8b9fe313cb48c89565a287ca91c45004877815ee7660be6b701d2464119a +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d52ce0b036e0ee9a2e140d4480d93ff8d2891771 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/trainer_state.json @@ -0,0 +1,603 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.066666666666666, + "eval_steps": 10, + "global_step": 380, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.22670735474688e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5c29af00298f038dd705a0689c3ec024970a64c7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:02946de4fc1676939f27df4501cf078ab1327397efa18ba4acacb8ba8b7c4883 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9c2bb34ab4ed6d18b2846ee19556692fd13a9e5c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1e2c6c4a9a34098824156fc21d9ed18611d191632ae7646425f3abb51c8054d2 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b458d8885e612e71d79c420d6ca3a40dcdcf7fd8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f47a6a8940dea009f3b7ce239248233dd458275df17acc4fa8ff99eb346e8979 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..bf1961280088992857ddb8fe8d4584423c44edf6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6bf472a6dc646995e9eb3a1b728ed47b4f764790f096bc535722b440312b4b49 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ac3c582b17fc25697561a96d0315e8f62a3d5f9d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/trainer_state.json @@ -0,0 +1,618 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.2, + "eval_steps": 10, + "global_step": 390, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.39056807460864e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..abc0349278c15036d3c7f716088a22070f8e2d88 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ffe4adf1d52da24ce2d0152da8f268cce278a8f57d87bd2a6a38546895e886a0 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0cfc0320268a13fcd3cb11d2325ab8afb43c42a9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:79c93de84f4dfaa278438b2a65d0d5624269fb413ef8f46d155f629d701e0535 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..cc0cb9030af17e56f3ab00fc0ad6850b4636069d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b5fde33a4ff115b0a519c0ef179183e0540c837c91cce3dba97312fa8e725570 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1159228ea69439db76026731513cf5c71e57f3eb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f953d62fd365ebab5cb8aad6e7c0cdb075e95f55a4cb36b4f4e0198710f2320 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..16fbb99d76434ad0b6164ac1a64aa556a16b4b61 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/trainer_state.json @@ -0,0 +1,93 @@ +{ + "best_metric": 2.4596850872039795, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40", + "epoch": 0.5333333333333333, + "eval_steps": 10, + "global_step": 40, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6554428794470400.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4218959c097a8e3d899c43612545630641f48332 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4832794ca7f266e9e6b8f6e67c58de8517713ca122623f1022dc9a5e909011d5 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a574a0776a25076dc0f7dff82902f7cae2b9db25 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f1dab84e4d5d43822eaa673d47463dd10c33779df99356626d2992292f63c91 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d06e3c475517e0d14c13a6ccad84a3f20110949a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:96f529f9856ab8a411ac6b8078e33cfc18c0159c4947cd8cac8e1238fc1754c7 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f91719d8a1b8836b7155587d155c2b2cfc9c7e48 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bbe59d4638e3afc1c337d3e4814ea99d33c22eec7bbc39984af69898855ffb2b +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..1922c2c7413815f2081d7e9545153d2f5e168229 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/trainer_state.json @@ -0,0 +1,633 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.333333333333333, + "eval_steps": 10, + "global_step": 400, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.5544287944704e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0027e44443b7833c2a9f6966fd2b2439ea727aab --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:798575e1bb867e4f126c5ea0c7ccc9148eb608c0e362bf03fc0cb15086be3a89 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..40405d50d770274fb4635d7dd22e91629dde3713 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b11db5cab25b10281efa9e37939a312066bfa801393807a591f9c9e43923e742 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..090a1de878697aa3e6255ed23ff26ce6e561a9fa --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2cab01f3c0a9d66cf16eec91d8aebbfd533628e45bdb849b4c3e4ad317f15270 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..390146116c48e62f4426eeb3a1cf7a2ccb90f69b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72422499e547842d9c164e7afacfea53fe3941a7a106527c3755c473fa91c799 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e4352609b37b1540d4632032bf20a3d3cb7784cd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/trainer_state.json @@ -0,0 +1,648 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.466666666666667, + "eval_steps": 10, + "global_step": 410, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.71828951433216e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..04579d8993695b94a93de889767ea804cf5517f4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2579cda0121f3eb33471b44a7b00be80976b3cd55d5a8dfeb366c9fae1fa85b4 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..733bc9671dc11d006bf1b5bd2002207a99d26841 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7001f7a3ddfa1806af643df67964c08f289d4f42535a5f71671531c680c612a +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..7c168ba589ab149907f65c12980a55da76890995 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:02f02c3c7264962c7bbb05c73c2c2f9530a34cf2c29d550cdc787ae19eb6d9bb +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6b66301b7ac8ccf1308c1ac8d63d7000259489d4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5336fb81030d9ecbffa34471d17a4c3981e781c865d7ff7a9b59e360e4230577 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8c3461ef63617206c966020271f7d8a87228a31d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/trainer_state.json @@ -0,0 +1,663 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.6, + "eval_steps": 10, + "global_step": 420, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.88215023419392e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d1b61d83245989555ff6e37f6a0f654f45cbd897 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:eafb6dbede415008098a1335576939e3f23557b5e3362235181f72c83df45198 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fb81b2e1605bb6cbd5c9a3e0e7aa4a83e38bd1d6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2acbf26ab553b968c499a1068efa4fdedf18656a3e9fe4d648c32040e39edc95 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..eb08c850753d158caff59458c0a4d2fa22ad5de8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f5c1faf0e9eb010c64f51b35236463635709da903fff7194839666558e862b6 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e44e26be68a19106ac45dab84a43a732acb91528 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:579c34af7d7ec0609fbd3479f4f8d8571c4cef90c76d9f6bacf43740f58855d8 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..441cd3692cc9cff8f041ebeef0781cd230d4c9e4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/trainer_state.json @@ -0,0 +1,678 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.733333333333333, + "eval_steps": 10, + "global_step": 430, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.04601095405568e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..76ff5d371155cbd5da6e01413a4013b66b1b4951 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4841e8769fc5597cde4504e28b5594300f645ff8325a567386378016dfcae8d2 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..24c6bd45a24db72be9d35bbe59467530223d80d6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e0ae4a28e284f4391a6161732674ea3708aabe7d8ff95b4ca8f82df69b1038c0 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..5fdc5e50e381540856fecccc6c375074d1aa7b0a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54abee51bb88479cda4bf77e85c2a545e7fb3c5e42f56d1baa63f1344dcc0529 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f8f2b85f23363ba098112683059a3e46233b6bfc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01a563f529b13f402d286b14bda74d3530e1fcecb2bee786164bfa1339da3729 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..1f439eb19f61a2fc153694ac655e3c795e0c34b2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/trainer_state.json @@ -0,0 +1,693 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.866666666666667, + "eval_steps": 10, + "global_step": 440, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.20987167391744e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..7a090df4e4c186c5470b9cdcc778922c765cdfe6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9d5266fb2fe76cece7c27d6e9344c53e582e8e9349415f577399917c424f564 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0b37dec24f8ae8580ac43b1a5c835aeb9e1c2fcb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7690fc8601db3439092f6f5c29c17b4d6092b5ab3793dcd727f2f1feb0732ea7 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3e7c44b011328e871a23ca1fea7cc6ea78d70a29 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4cc0a8131f9f14b855b33975c5e795a94be3a332a0f3cf68a9ec3ab6ce73b177 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..671e99d731836dff5ed479ba9e24ab368c795616 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bb8360cb66be4e8be27b2f376c800950e3f00449fb6491d6247165f9aff23820 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8e2e862c20771a4125f104cfbb7154a6f7512cb8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/trainer_state.json @@ -0,0 +1,708 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.0, + "eval_steps": 10, + "global_step": 450, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.3737323937792e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6d3ad8a522a65e1d4c71a310998368b87eb732b1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6819a2a5a02d6b6dbbbf7c248791c04fa105625561e44b195a5b3b79a15eef34 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c3dd352d92d6083a5273e775c45d8a7c421ab4cc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a1bbd184745c8ff31a7f77c97d57a13e73090e0c1d77c0a62bc898329e1b873b +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..82f7415495fcd1c3ffb5dae79c8c3a4c2269faa6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a6424cc1a4d391795fbea6a94823363dca21ce0e7ec6c433e8cb5b0aca0060f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..82f0764fa1ca7bd5d0d2c27e699e54f97149a9da --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:09ce894ec673ae7c851228a15e2e8a3dfc488203c01cbf434a7c4cbec9b7becb +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8af3007e57f69ac74a0d694ac6780e487fc2533d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/trainer_state.json @@ -0,0 +1,723 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.133333333333334, + "eval_steps": 10, + "global_step": 460, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.53759311364096e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..716c8d492b86fafd1030d92da4172aaf47e9114a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6ae0dc5d444b33314a1d7046a8fd3e72b541f81d37548e29b78df893686a3869 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..6fd1c7dbcd579515e9301b2152a527da51b262d6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bcdd71540ea1a254394563ab4694ccdfe49759c180605db1bb693ee138e4ceae +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..84ca1f63cf231e2aa1c43b465c46ef11c80bc867 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03fc4a1860f68759a4d7833f4317681e377d4e71cf91ab1f091da8cd71579d26 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9e4c1530ac9944d4b54caf372d4f9930c6597321 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee66a0b6b4d05213664fc79a1ffd83a3bbefdb7154906787c3ef06bfdc4539f5 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4ff1f45443ccf1ac9fe450593077dd6a70ebc4ca --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/trainer_state.json @@ -0,0 +1,738 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.266666666666667, + "eval_steps": 10, + "global_step": 470, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.70145383350272e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..01bd5ab056a70ecf6d8586fe104cce8e1f821a07 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:40eedd7fd90153ae12877558223eede523f3a212b9d7be2018bc410646c18cab +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..36a41f30421896f1fe8697fe26949851988bc51a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b4b0813f0899b02a92f8bb715c81610d1ad92961fd60337c87a442c33129c9af +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..302025be6f88ae472170fe5d230ba39d4ec976df --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:918d6ec8ede8d7a880512e2fc44b16d7c22df85e8b411a004d142edcf446c40d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9c7a583c3e236b2f110dd12004cef1d9a2b13311 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:535824b66976a8cd20163034000bf2ae1a203551ed6ea6132858b6421f4024c0 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..48bf44b3ab6c8c4c5730958c44714a151c6d8b0b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/trainer_state.json @@ -0,0 +1,753 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.4, + "eval_steps": 10, + "global_step": 480, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.86531455336448e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..76f23fb2c4877e6e756dae9eae989925e47ad422 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1892395ec2331f0e5e0c3d567a70245a04f49a110a12195047c1cc32f6348a45 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1bd52c57b20cc7cb637637786171f8624f6d184a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:02aaad4eaa9db1338164b3b54abb52160f213f9b32edb3de0ac7d64362612d9d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..031b265de35950a615eacc2c86e46292f552e541 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b56a3ff26dded8216d560cf73ba4817b5973851b78edbbf6aa9d6b515761df8c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5c509b1230d4d9d9bf05bb1cf38bcd2d3119d2c8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:65f3e63eff29379b2f31d4f746c0c715c2b686bd11d7e07aba3d5f29231a18da +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..13f6fbb64627866771fb570d406cd39ced48a7ec --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/trainer_state.json @@ -0,0 +1,768 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.533333333333333, + "eval_steps": 10, + "global_step": 490, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.02917527322624e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5ce69f3aff9d5a7e03725dc1d29188c63307b118 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b9925e3e0ec98006322e7602286c69e8ac6b6af4bc6357b1bbfb1f42597795c5 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9f1e16bbc21982e0d5e2069f6486e5c30cfc5357 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5606ead4e7c5368c4eb7cb2f8d9f709bdb604892ab526db3ccb602e42837c867 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..c1fc54eb4786e9f15244e8e4274b14688b87da5d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7062fa0264c6fb17100531852b46c235ce631a6626d5e19749a65ba8723532c0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cee24f7781db565e483521e84ddc6dd277a07ef3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8f79415c3ece613ed89d676bff22f42086790a2bced0de6758824fb8c7e27fcc +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e65961add8483ecccbf0e3f3c2032d3a75cd93e8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/trainer_state.json @@ -0,0 +1,108 @@ +{ + "best_metric": 2.459153890609741, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50", + "epoch": 0.6666666666666666, + "eval_steps": 10, + "global_step": 50, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8193035993088000.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..491640a21e21ed55fc4a99d14f059b41f4f503d9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:57f1310ade2681c8f28039c9c9351c4bc3aa9af17772142ad43fe75ce3c1bdb2 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..02fc19c2aa0a4bb6049cbc75f57e6c59a7e02ce2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6867563696a335b2a0bb22588baef00bd3cebd1ae0115fc3ef5457dda7938c31 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..96edd96602542afab3935d537c8d1428ce43196b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:beda198a64f1e6f1db0895ff6a6859c2af4c98fbf9c15d1daa4dcca9c20f50be +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..36002f421a8027f0e22e1cea8d6c317eebfd0e2d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e63d56828d52c149ac34c43bdc2adc48c363068c94b9a3df26528670b68d615b +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a30e0242526da8d177b8e0f5db3b42bfdfbae3d9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/trainer_state.json @@ -0,0 +1,783 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.666666666666667, + "eval_steps": 10, + "global_step": 500, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.193035993088e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d8259af65bb37a20f873025cfff5c106ce9c308a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1918d597dfb37ac48cf08573fe263a2fa43e3eaf58814ec3e6c5151e27ac5a90 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..841d5a5c356c45e4c74aa59dbefde6a6a31c57d4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:216e03412440abe470ec319f285204cc01b1319950a79b5a83f5cc7ed932342f +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..52b85f2bd42c764f793cd9aa8382577ad1b51617 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:156b16fe2af6b1592b431fe36919ba4914ab9e672f318f884f5045be66654277 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..a5298f7a45852e72ab3264eef95969ac26ee5012 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14a2456b0fb437e597f1bc67f02d12ea64caadba3ce80e5a7bba56290d13a10e +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4c0d52b95d1ff5d6d9fa199889a5c64924cffcd7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/trainer_state.json @@ -0,0 +1,798 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.8, + "eval_steps": 10, + "global_step": 510, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.35689671294976e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9c54c873b4cc1a7471aa4bc9a56c9e1258fb35e6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ce5a650b8c2490ba0319eb92bc86c84301344617c5b2b3180fb8dabe031c6876 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..df34147e13e22fb275668af3971d2eac6d3775e4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2d392ccf4de56ca6340d89eedadb490cf71fce0984572c55d184f797cb6fd173 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..736afdcce42e3e1d5dec3aedeed239bc0b63975c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca29f15bc2264125f00923607dbea007ec921af3e528271a2bb77db5cd4d2b66 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d470aca3bc75a59cd83f65a7641e2227523184b0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54453f7799a2c12a65729e49535ef0d1133252bbba34418ca96403f477d1ed92 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f43bbc6757d64a0b184e36f456883bd683fc1814 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/trainer_state.json @@ -0,0 +1,813 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.933333333333334, + "eval_steps": 10, + "global_step": 520, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.52075743281152e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..430e0e50cc972fd9733acef1cb4203f307fb1717 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6cd05200bf80e7f2a3ed7990fea88db4853205f4b184ba037bf347374d95d379 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5cc6aafa75227648470857159fb2a8af3d52b53e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aef8e84b853f54e61bb58851eb4162b895b8199aae3e533da9dbb54f05cd6809 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b0413aa128dc89fb63c7a74242ac1a6da3ecf5bf --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9436217a6dd3838565d7b9845d97ff2e933eb514cc6ac99465ebc3448de3312 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3038016ab1789281fcb7570057f9ac7ff03feda9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:67026c5b7b6af0a730215316d61a8dcdd8b26b784be7a50e23105aea365fc01d +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..fffce1c78c27384cdacd0993c696904d0279dd13 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/trainer_state.json @@ -0,0 +1,828 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.066666666666666, + "eval_steps": 10, + "global_step": 530, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.68461815267328e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e5a9adea844be57df29399543e06c26f357e1c47 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:643457cb9bd8924b8761375687ad41c32392ffa6ddce21e65dd5f40167f173dd +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..3d363d3f279e3b2868713ecc0dea9120162beccb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f9c4313f7cba0517530da214424ee1cded6e7f18df40646995612b833e65c0c +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..8d48caf21e655a01d7675a2b465c934cea676943 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:816bfad4f86e01da7fe3bd5bf7d10c902cf135a5b5fec9e0170158290fe5828c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4191ea1c11397f76dbbb9677283fd3b541b6e689 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:61e7bf31ab25b6a7b2f0902a2e1f6ca5545ad296580f627246378508da64fa41 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4d113b585c717e299568cd7c76fbfdd3c02acd63 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/trainer_state.json @@ -0,0 +1,843 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.2, + "eval_steps": 10, + "global_step": 540, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.84847887253504e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..3b4b81809e8d36b90071f5c57f7b14fb4afe937b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a5de23cffebacb0d33f63c747700304fc1a03a0a4748f773ee60ebc6433b1b1e +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..29831e00b8e0f901d3b1be73e0b68c4b9cf0d2f7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:275016cd8123251794c42ddbd38ac668d4e5f618df182e31cb5c432ee1252b35 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..9dc1ec111f2a6f7fbe8d878013e83df65b5f618a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5b6faa8c50c89ce52c86274c8c795afb3f00524e7aef4544572df4b5b6b12c6d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..f59d0454a2e540196c447dc81e215fde49e60f8d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe5b11bd9034273a78668f95788292b87ad00f4f53e9e4864d3471380b5838b8 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..598c35e2923c7fdad884be91904d3b67496856bc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/trainer_state.json @@ -0,0 +1,858 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.333333333333333, + "eval_steps": 10, + "global_step": 550, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.0123395923968e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..b1188c240c36987936b1bc7e21dd030ab92f0575 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:87124fcb747247c088582f6fbd45c09782441cf70651fed5d3eefd4ee3defdba +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7f197d81927afef3adef1d698d27fe352b349c7f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5db85948077f97488f484293a7861d2eb91799043d404d3e851b29c3c18784ef +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..75311ff97c8628cb71fe6f6cdca5e9e1127d30b6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3b6745ab2a92f54dcacb73c3ceec9d54235e5b225134fb7703879ee6185ad897 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ffa4f4faa1638037c7009a7874a8ec2f958a56f3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01b0d6dead233f71cda974ae02165d32469a3692fb9b97739fca51d1798a012e +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8bcebf145b2c96c49628addf6657d1fc9ddfed6d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/trainer_state.json @@ -0,0 +1,873 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.466666666666667, + "eval_steps": 10, + "global_step": 560, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.17620031225856e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c49af296f0a1fa3b091d9e7525ebc4d70c049979 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bef30121ee6925825a368e65057003596f975f3151024f470d97f75fefc8605a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..8b63a6759219b30fc864a6d248dd233b392de08d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:101c3bbc6f0269ba9f31b793013783c14e7eb70e16794d59553bd31648e80b75 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3ed38f9a78b3dbf6f2e73e5bd68681ac198b1983 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d966d92a47b281ed57ee7f44ee2eaa60a54786f7ca9b7e8829ab8723bc8a5a1d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0f6ab5a4a6c1c8537d29396d68ff9a943067c8eb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6a34ac3e6b737b225204c7a1c95f58427255f84b0986866cdac344b9d5ba4319 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5f10102d160c9380b21855a640981cb5565b6e33 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/trainer_state.json @@ -0,0 +1,888 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.6, + "eval_steps": 10, + "global_step": 570, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.34006103212032e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ce5399b082c7bb73c091a0427b43d34ca0b52610 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:609fc131e5086765d57a69343466ef146e5e0adc59821ca9d131269d4011c6f3 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..392f4907d8ef236a2858b6e9bf2fe15d424ee877 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:09e60dff3ef7c2f1c2dd8495b2cac2e4e040e04a346aba9a21b7ad37760db335 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..6f12baaba3ec135e726e0b75dc20ee8cfe8a995d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:55a6ddc6425602c9554969e2910a1ee66847f95ab8fd86352843e16c6530b2c0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..df177b9452bbc35cb78b91089f310520fe740b94 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3aa3352ae201120fa831c764f5b07fe3f9aa427e68763e4c88ed9af407727f22 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..991b881d6cdf5ad892ff9fed11934ac5987c2474 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/trainer_state.json @@ -0,0 +1,903 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.733333333333333, + "eval_steps": 10, + "global_step": 580, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.50392175198208e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..cc1557997cb9ba51b889390010eefd1a004c1547 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d0cb1174370ee8a64b24c0b55b5329d4743da9e4727c99a8cd9703f203e22ae9 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..181fc525fa40d8dcf78c54c4b407c8f885bc1270 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c1cf898aa9f328f77eacda0e09a9e78c21b7786ea1da75da50762bb826a6d16f +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f2cbe02e4922a4920c0a827f09f6df580967beb0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5704b322a17ce5b2788c1247543e3ca9edc36d083fd8ecc8ca80d04334c6030 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e93107ab0b0cdc649d183c879754ed083006f9d7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c640cac3d338c5c53c53ad351f9ec822b97e3962fe58e3c4439d6cecd03512ac +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f14da48865898d0fff95f099902470ee80dad647 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/trainer_state.json @@ -0,0 +1,918 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.866666666666667, + "eval_steps": 10, + "global_step": 590, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.66778247184384e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..46fda3e17f14eb3d672005e987ac3fd45f6e6d96 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9c2e4262fcb78e166de154c8e0a28c3e1b8baac7b77e8edd00a50be2c86aad53 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..443926f8531fb58771304e6825c36d0af948f93c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:230a6da62611ea433da37380b45e0a7f93e5a4755414ead6b630041eb3466ff3 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3d041c10a3af80c2be01488b87e7c23a107acab4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:224b98cd2a3813f8f156af229101dde99ced2e24294f3d7ad7b1538fdc49c27c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e35866f32db88c57fbcc281885df929786abae39 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db64dfcaaa6d2770fdeb8c6c250f6efda7e6b2cbc236d50bf153703fcb63ac50 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..02e2756c8a48a488eb202ef3f3fadc1bfef51ed7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/trainer_state.json @@ -0,0 +1,123 @@ +{ + "best_metric": 2.458838939666748, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60", + "epoch": 0.8, + "eval_steps": 10, + "global_step": 60, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9831643191705600.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ed291143e06ca88e0a2f4ea3b1f3d189e76fac23 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d104a2378a785bf677d194286fcc08d1c063ad8ed37399cd2a8e49cb0ff09106 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..84d8c54963c8e05744c022b7227250e61bebf0af --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ea59547aade4e7b1d6af541bbf1341174f3a01ff82ab18c998f2a047fdca731c +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ef40b259bc3233779099c3b8651c2fe0a9d07fa5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9bbc772ea5a37ab482a5fa0d13a2014584215ee3da6246ff6fe50fb8dafbfb8e +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b706276d2ff18ccd83310c61d87eb2ed9fc15f80 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:efa99c3d03a71b7b58bf8c6b52c8cd63b4d6a19d88cbdc8dfd20580671d183cb +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d674b03bbf5ab229d2867807b464d300522fd85b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/trainer_state.json @@ -0,0 +1,933 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.0, + "eval_steps": 10, + "global_step": 600, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.8316431917056e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..a71a2047543a843a43707085362e5f7e4f82f142 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e5b575c45a5fd0b4c9b4db779ed40159e43555f8597a5713308bfd7e21fc7b9e +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..086fbab2c2168a48eb91cb34e9d52aecd8edd464 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1cb5e8e2c21ec6eefad0d7f3b2490fa65ec89b8fb5cfff5780a54aa1a1ab1ac5 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..6a970899a5edc16268fdea83560e0495a3d06810 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa5b53289977451ca52671d3897055616936322daf22f6e4246ff72a467aef1c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..9ae36545a6cbe48ac387e9a4edd1288050b062ff --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b0de73605756aa391aaa9ea36adcbd12bd865860a2561b0aaca0c704b25cfe02 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..8d94226deba80d6db961a90068d6cd8ddbec1a3a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/trainer_state.json @@ -0,0 +1,948 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.133333333333333, + "eval_steps": 10, + "global_step": 610, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.99550391156736e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d82e946a25b1de06ba7e7daff4f9243dfb66a88d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:903b6fc0dc99d3a435bc431a24fabe54d3b0850c878ad86d668cb88a63232606 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..33a3aa7f4c39379f3626ca0673eec417c4885a11 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0c1ce29aa4a9dffc37a762d13a32290f25f4be65b5d4523bc29de48020c9ac54 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..da7e5f0f7045f8fad1c1529974e555cc67b8f5f0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f2b2ce429e00eba0165cdfd527b7ca384fed68ae5660561d0cbc6dbdd51ce7f1 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8f09b0521c27c995f0878cde37cf7b4138abd8e6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0cecac504a0d6e20c848bc43265028cb51bdbaee46716ad0736302cdd3a2376c +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5019663f9a8fa879aa76ab7efe39ce5126b1ab9c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/trainer_state.json @@ -0,0 +1,963 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.266666666666667, + "eval_steps": 10, + "global_step": 620, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.234650135040283, + "learning_rate": 6.51851851851852e-06, + "loss": 1.5486, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9641079902648926, + "eval_runtime": 43.929, + "eval_samples_per_second": 22.764, + "eval_steps_per_second": 2.846, + "step": 620 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.015936463142912e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c94c387dd52842e3e86d3c2f75e2e1ab9048a0bf --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:de6918c6818af8cd78ebedf3a22d666dbd90fcdae2903e71144faea1a62204d6 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..11ac5495f84f8a158c09a8f97ebef38f697bbdb0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0bd391c610134d6b231b8d638c1460f1fd8e23f7186a91f8916b31c5c23c8a16 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..96d7a3f6be074e46014211fae837a521e5c5140c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd6c4f62bed5401eddcf930d960632a48c624bea715ca64cedd7d04db198b4a0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..31b434c7d46bacc0a45ac73d9e6264e373e131cd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d8d4a20c36091528ac87a7edc5845454d614d78ad71a59c7a4ae563b2fe291f +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..2f23af4b90bf0bd50e061ab9db4123ae3d14b720 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/trainer_state.json @@ -0,0 +1,978 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.4, + "eval_steps": 10, + "global_step": 630, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.234650135040283, + "learning_rate": 6.51851851851852e-06, + "loss": 1.5486, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9641079902648926, + "eval_runtime": 43.929, + "eval_samples_per_second": 22.764, + "eval_steps_per_second": 2.846, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6264803409576416, + "learning_rate": 5.333333333333334e-06, + "loss": 1.5429, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.9659907817840576, + "eval_runtime": 43.9159, + "eval_samples_per_second": 22.771, + "eval_steps_per_second": 2.846, + "step": 630 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.032322535129088e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..3a3273d786194d2cf0f8d8ebdf8a36187267a0f9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:933ffde5459d3c61652501bb66195f752a5129af487c28e415672d979d65b052 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7fa25e4edfdb890c1af540da81757bf36ae2fe12 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f0be097f6358bb8c934021d88a4da303affafbe3f204e569d164178c39ac900b +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..bc02fa7e506af341c87e94bd62a6cbdfbd057096 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0597f3b9ac321e002676eb1712670348770197d9b197cdd7a7e16f465315444e +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cc14f7545324288e67e156d036369e2cebdcf74f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd2d3d121570090627f59257118b55358f83f1b060f0fb11ab062387addadff4 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4dda91a8a00b62e8e7262bc1ebe9390ee71b7541 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/trainer_state.json @@ -0,0 +1,993 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.533333333333333, + "eval_steps": 10, + "global_step": 640, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.234650135040283, + "learning_rate": 6.51851851851852e-06, + "loss": 1.5486, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9641079902648926, + "eval_runtime": 43.929, + "eval_samples_per_second": 22.764, + "eval_steps_per_second": 2.846, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6264803409576416, + "learning_rate": 5.333333333333334e-06, + "loss": 1.5429, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.9659907817840576, + "eval_runtime": 43.9159, + "eval_samples_per_second": 22.771, + "eval_steps_per_second": 2.846, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.2673451900482178, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.5452, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9610865116119385, + "eval_runtime": 43.9415, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 640 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.048708607115264e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..93a08a7790378f620484f43133586a83a8ee7a72 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:416272eef282e6be298cdeb6563fc543d48f8a001e9795fa2c0312da13fc46c2 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c56e03ec444630e2e310b48a472163d886d854b6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:18d9666ff1e54e5535408ccb8b6aaab05a8a5d79dadc40aa56d668c73695741d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d763156eb3a586b51733d4ec683a815a6ae5fab --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e66e316bd2615a5005aac13970f8b8e71830843ea716191e53ff7dc38997af08 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..75f881b3ec9ba86b1878709fb0af361a6f712546 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:31b02e0b7ebaaab7bf8f183e3b47970500df166e496df9fdd39405913db43e64 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e74ebf83c022ce41d2587171f68b0d00a7564594 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/trainer_state.json @@ -0,0 +1,1008 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.666666666666666, + "eval_steps": 10, + "global_step": 650, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.234650135040283, + "learning_rate": 6.51851851851852e-06, + "loss": 1.5486, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9641079902648926, + "eval_runtime": 43.929, + "eval_samples_per_second": 22.764, + "eval_steps_per_second": 2.846, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6264803409576416, + "learning_rate": 5.333333333333334e-06, + "loss": 1.5429, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.9659907817840576, + "eval_runtime": 43.9159, + "eval_samples_per_second": 22.771, + "eval_steps_per_second": 2.846, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.2673451900482178, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.5452, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9610865116119385, + "eval_runtime": 43.9415, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.4228227138519287, + "learning_rate": 2.962962962962963e-06, + "loss": 1.563, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.9621055126190186, + "eval_runtime": 43.9598, + "eval_samples_per_second": 22.748, + "eval_steps_per_second": 2.844, + "step": 650 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.06509467910144e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..20e618be3c1257c7454e9a103ec359245c15c748 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:55cf809a271e36518aa3da89f75594e2cba2517bfa30553165aa1d9d290668af +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..66ea10d23f7856fa96d187353a21eb7f62f4f91b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:207e52a0e307fad152e2a64e2e8a2b3a229ef071fe3d749fe5238a02ae2249e7 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1bd0e24dcfea6867dcdb66e0b90f3344dbd9d339 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66fa7ea9452d536e82e5c18c4a0a05615143763aa569d9af13553a06a11128de +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..403f6f78ce81468eb12e4e1c093d8452c7d5a14e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9e3b3eb476269cb66006445e45fa57a95b9d6fbb9998ae81b82199f9b98541e +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f5e41fbda861e4d0a5a0fc51e715213538cefd9d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/trainer_state.json @@ -0,0 +1,1023 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.8, + "eval_steps": 10, + "global_step": 660, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.234650135040283, + "learning_rate": 6.51851851851852e-06, + "loss": 1.5486, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9641079902648926, + "eval_runtime": 43.929, + "eval_samples_per_second": 22.764, + "eval_steps_per_second": 2.846, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6264803409576416, + "learning_rate": 5.333333333333334e-06, + "loss": 1.5429, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.9659907817840576, + "eval_runtime": 43.9159, + "eval_samples_per_second": 22.771, + "eval_steps_per_second": 2.846, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.2673451900482178, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.5452, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9610865116119385, + "eval_runtime": 43.9415, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.4228227138519287, + "learning_rate": 2.962962962962963e-06, + "loss": 1.563, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.9621055126190186, + "eval_runtime": 43.9598, + "eval_samples_per_second": 22.748, + "eval_steps_per_second": 2.844, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.652618885040283, + "learning_rate": 1.777777777777778e-06, + "loss": 1.5375, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.9623842239379883, + "eval_runtime": 43.9612, + "eval_samples_per_second": 22.747, + "eval_steps_per_second": 2.843, + "step": 660 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.081480751087616e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..703bc9a5f57c4efdf87075714229ef8b2567d6da --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f4b6706ed67d375ac24d8472744f8970fd964a8019f9ca78abca10185ae37b41 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0ba6151279404a63dde48556e73472defbc4754d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:35c65442ef85c29470d979f1f7c097817770f6889612562402dfda67933e2591 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b50ed8357a00070f99a52843c3e3d150dbd5b1aa --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5bb0850ed44e50e4ccb2afc9aab9a80c17a31208454b069930105956f7f9a183 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..72ac93ee4249bf1220c3ed82f099c14ae0267a68 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:886b6be563b163a73eaac3a0ce905ce45ea5202bed173e897fec04ed18434edc +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5fcea31bf83a3ca1c5f7522ba3fc5f5ea72feb1b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.933333333333334, + "eval_steps": 10, + "global_step": 670, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.234650135040283, + "learning_rate": 6.51851851851852e-06, + "loss": 1.5486, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9641079902648926, + "eval_runtime": 43.929, + "eval_samples_per_second": 22.764, + "eval_steps_per_second": 2.846, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6264803409576416, + "learning_rate": 5.333333333333334e-06, + "loss": 1.5429, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.9659907817840576, + "eval_runtime": 43.9159, + "eval_samples_per_second": 22.771, + "eval_steps_per_second": 2.846, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.2673451900482178, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.5452, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9610865116119385, + "eval_runtime": 43.9415, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.4228227138519287, + "learning_rate": 2.962962962962963e-06, + "loss": 1.563, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.9621055126190186, + "eval_runtime": 43.9598, + "eval_samples_per_second": 22.748, + "eval_steps_per_second": 2.844, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.652618885040283, + "learning_rate": 1.777777777777778e-06, + "loss": 1.5375, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.9623842239379883, + "eval_runtime": 43.9612, + "eval_samples_per_second": 22.747, + "eval_steps_per_second": 2.843, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 3.748727560043335, + "learning_rate": 5.925925925925927e-07, + "loss": 1.5587, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.962862968444824, + "eval_runtime": 43.8547, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.097866823073792e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..70b1d286a7a62a8599834faef456120689d62487 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7a1ebb51c0fb504460695fc588e4ccc53d2b766ae72927e2937deab654da9218 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b2becb2bb61172d2edde9c916be4bd054f1b295b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a0c3faa0a6d3aae2f23aa146c013216718863fe2c17a7f09c545f1c6ff782cff +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..bb61823d0d78956427b74dd1a3fc741ba1b2381f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c44717b587bf877ea1a37c7f5747a93e45e34ce231c845a31a9b8a042ee22593 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6f069e8ad5743a7071d53989d6edf25a382b7133 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d33603c9602f50d32bd619f686fa4097b405a474d15f526ce09de1176943edee +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f059bded886fdf7d3314d8c23db369db73e25fa5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 9.0, + "eval_steps": 10, + "global_step": 675, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.6091941595077515, + "learning_rate": 6.814814814814815e-05, + "loss": 2.3262, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.465653657913208, + "eval_runtime": 43.8663, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6248416304588318, + "learning_rate": 6.696296296296296e-05, + "loss": 2.4474, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4680497646331787, + "eval_runtime": 43.8627, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.6632985472679138, + "learning_rate": 6.577777777777777e-05, + "loss": 2.383, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4679770469665527, + "eval_runtime": 43.8572, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7147810459136963, + "learning_rate": 6.45925925925926e-05, + "loss": 2.4136, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.466507911682129, + "eval_runtime": 43.8262, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.6547646522521973, + "learning_rate": 6.340740740740741e-05, + "loss": 2.3066, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.4666085243225098, + "eval_runtime": 43.8597, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.7784593105316162, + "learning_rate": 6.222222222222223e-05, + "loss": 2.3319, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.4686949253082275, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.7622500658035278, + "learning_rate": 6.103703703703704e-05, + "loss": 2.1593, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4828524589538574, + "eval_runtime": 43.8991, + "eval_samples_per_second": 22.779, + "eval_steps_per_second": 2.847, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.0344569683074951, + "learning_rate": 5.9851851851851855e-05, + "loss": 2.2796, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.5075161457061768, + "eval_runtime": 43.8543, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.2268551588058472, + "learning_rate": 5.8666666666666665e-05, + "loss": 2.3056, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5140538215637207, + "eval_runtime": 43.8786, + "eval_samples_per_second": 22.79, + "eval_steps_per_second": 2.849, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.172734022140503, + "learning_rate": 5.748148148148149e-05, + "loss": 2.1717, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.511234760284424, + "eval_runtime": 43.8582, + "eval_samples_per_second": 22.801, + "eval_steps_per_second": 2.85, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1857093572616577, + "learning_rate": 5.62962962962963e-05, + "loss": 2.261, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.511185646057129, + "eval_runtime": 43.8443, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.247998595237732, + "learning_rate": 5.511111111111112e-05, + "loss": 2.1853, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.5199790000915527, + "eval_runtime": 43.9171, + "eval_samples_per_second": 22.77, + "eval_steps_per_second": 2.846, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2094316482543945, + "learning_rate": 5.392592592592593e-05, + "loss": 2.2272, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.5180249214172363, + "eval_runtime": 43.9504, + "eval_samples_per_second": 22.753, + "eval_steps_per_second": 2.844, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.243608832359314, + "learning_rate": 5.274074074074074e-05, + "loss": 2.1402, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5342462062835693, + "eval_runtime": 43.8831, + "eval_samples_per_second": 22.788, + "eval_steps_per_second": 2.848, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.7646971940994263, + "learning_rate": 5.155555555555556e-05, + "loss": 2.037, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.609447717666626, + "eval_runtime": 43.8432, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.7528915405273438, + "learning_rate": 5.037037037037037e-05, + "loss": 2.0605, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.5980136394500732, + "eval_runtime": 43.8395, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": 1.6167232990264893, + "learning_rate": 4.918518518518519e-05, + "loss": 1.9957, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.6037070751190186, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.6821871995925903, + "learning_rate": 4.8e-05, + "loss": 2.0322, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.6061909198760986, + "eval_runtime": 43.8332, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8337373733520508, + "learning_rate": 4.681481481481481e-05, + "loss": 2.0411, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.6049749851226807, + "eval_runtime": 43.8659, + "eval_samples_per_second": 22.797, + "eval_steps_per_second": 2.85, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.8310989141464233, + "learning_rate": 4.5629629629629636e-05, + "loss": 2.0681, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.6043858528137207, + "eval_runtime": 43.8632, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.8980363607406616, + "learning_rate": 4.444444444444445e-05, + "loss": 2.0613, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.6042895317077637, + "eval_runtime": 43.9234, + "eval_samples_per_second": 22.767, + "eval_steps_per_second": 2.846, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.212002992630005, + "learning_rate": 4.3259259259259264e-05, + "loss": 1.8818, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6978321075439453, + "eval_runtime": 43.8538, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.1973021030426025, + "learning_rate": 4.2074074074074075e-05, + "loss": 1.8765, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.7161757946014404, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.296255588531494, + "learning_rate": 4.088888888888889e-05, + "loss": 1.914, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.693924903869629, + "eval_runtime": 43.8416, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.0251951217651367, + "learning_rate": 3.970370370370371e-05, + "loss": 1.8477, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.6961801052093506, + "eval_runtime": 43.8478, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2686941623687744, + "learning_rate": 3.851851851851852e-05, + "loss": 1.9276, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.6912009716033936, + "eval_runtime": 43.8498, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.327507495880127, + "learning_rate": 3.733333333333334e-05, + "loss": 1.9623, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6954143047332764, + "eval_runtime": 43.842, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.414926528930664, + "learning_rate": 3.614814814814815e-05, + "loss": 1.8143, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.691159248352051, + "eval_runtime": 43.8542, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.3508524894714355, + "learning_rate": 3.4962962962962965e-05, + "loss": 1.8356, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.721879482269287, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.8192615509033203, + "learning_rate": 3.377777777777778e-05, + "loss": 1.7628, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.8104028701782227, + "eval_runtime": 43.8621, + "eval_samples_per_second": 22.799, + "eval_steps_per_second": 2.85, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.628994941711426, + "learning_rate": 3.259259259259259e-05, + "loss": 1.7875, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7780165672302246, + "eval_runtime": 43.8457, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.5931944847106934, + "learning_rate": 3.140740740740741e-05, + "loss": 1.7675, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7890992164611816, + "eval_runtime": 43.8408, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.94872784614563, + "learning_rate": 3.0222222222222225e-05, + "loss": 1.7778, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.781428813934326, + "eval_runtime": 43.8294, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.8751890659332275, + "learning_rate": 2.9037037037037042e-05, + "loss": 1.8165, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.77595591545105, + "eval_runtime": 43.8138, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 2.7356138229370117, + "learning_rate": 2.7851851851851856e-05, + "loss": 1.7888, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.7776875495910645, + "eval_runtime": 43.8427, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.833099126815796, + "learning_rate": 2.6666666666666667e-05, + "loss": 1.7099, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.781153917312622, + "eval_runtime": 43.8247, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9174482822418213, + "learning_rate": 2.5481481481481484e-05, + "loss": 1.6969, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.857807159423828, + "eval_runtime": 43.8242, + "eval_samples_per_second": 22.818, + "eval_steps_per_second": 2.852, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.4313805103302, + "learning_rate": 2.4296296296296298e-05, + "loss": 1.6527, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8747200965881348, + "eval_runtime": 43.8352, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 2.9856536388397217, + "learning_rate": 2.3111111111111112e-05, + "loss": 1.6097, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.8505678176879883, + "eval_runtime": 43.8489, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.076874017715454, + "learning_rate": 2.192592592592593e-05, + "loss": 1.714, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.85793399810791, + "eval_runtime": 43.9013, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.0812172889709473, + "learning_rate": 2.074074074074074e-05, + "loss": 1.6632, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8589818477630615, + "eval_runtime": 43.8906, + "eval_samples_per_second": 22.784, + "eval_steps_per_second": 2.848, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.3620779514312744, + "learning_rate": 1.9555555555555557e-05, + "loss": 1.6928, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8587331771850586, + "eval_runtime": 43.9077, + "eval_samples_per_second": 22.775, + "eval_steps_per_second": 2.847, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.228792905807495, + "learning_rate": 1.837037037037037e-05, + "loss": 1.6838, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.8473219871520996, + "eval_runtime": 43.8715, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.2267239093780518, + "learning_rate": 1.7185185185185185e-05, + "loss": 1.6363, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.880929708480835, + "eval_runtime": 43.8985, + "eval_samples_per_second": 22.78, + "eval_steps_per_second": 2.847, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.509775400161743, + "learning_rate": 1.6000000000000003e-05, + "loss": 1.5792, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.9368298053741455, + "eval_runtime": 43.9037, + "eval_samples_per_second": 22.777, + "eval_steps_per_second": 2.847, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.7580230236053467, + "learning_rate": 1.4814814814814815e-05, + "loss": 1.5859, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9177448749542236, + "eval_runtime": 43.8448, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 4.051950931549072, + "learning_rate": 1.362962962962963e-05, + "loss": 1.584, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.9179320335388184, + "eval_runtime": 43.8601, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.176218032836914, + "learning_rate": 1.2444444444444446e-05, + "loss": 1.5762, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.9176185131073, + "eval_runtime": 43.8643, + "eval_samples_per_second": 22.798, + "eval_steps_per_second": 2.85, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.1765544414520264, + "learning_rate": 1.125925925925926e-05, + "loss": 1.6509, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.9212377071380615, + "eval_runtime": 43.9108, + "eval_samples_per_second": 22.773, + "eval_steps_per_second": 2.847, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 4.143585205078125, + "learning_rate": 1.0074074074074074e-05, + "loss": 1.6004, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.9180777072906494, + "eval_runtime": 43.9213, + "eval_samples_per_second": 22.768, + "eval_steps_per_second": 2.846, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 3.576042413711548, + "learning_rate": 8.888888888888888e-06, + "loss": 1.5967, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.9157402515411377, + "eval_runtime": 43.9479, + "eval_samples_per_second": 22.754, + "eval_steps_per_second": 2.844, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.2758278846740723, + "learning_rate": 7.703703703703704e-06, + "loss": 1.5504, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9387192726135254, + "eval_runtime": 43.966, + "eval_samples_per_second": 22.745, + "eval_steps_per_second": 2.843, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.234650135040283, + "learning_rate": 6.51851851851852e-06, + "loss": 1.5486, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9641079902648926, + "eval_runtime": 43.929, + "eval_samples_per_second": 22.764, + "eval_steps_per_second": 2.846, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6264803409576416, + "learning_rate": 5.333333333333334e-06, + "loss": 1.5429, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.9659907817840576, + "eval_runtime": 43.9159, + "eval_samples_per_second": 22.771, + "eval_steps_per_second": 2.846, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.2673451900482178, + "learning_rate": 4.1481481481481485e-06, + "loss": 1.5452, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9610865116119385, + "eval_runtime": 43.9415, + "eval_samples_per_second": 22.758, + "eval_steps_per_second": 2.845, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.4228227138519287, + "learning_rate": 2.962962962962963e-06, + "loss": 1.563, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.9621055126190186, + "eval_runtime": 43.9598, + "eval_samples_per_second": 22.748, + "eval_steps_per_second": 2.844, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.652618885040283, + "learning_rate": 1.777777777777778e-06, + "loss": 1.5375, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.9623842239379883, + "eval_runtime": 43.9612, + "eval_samples_per_second": 22.747, + "eval_steps_per_second": 2.843, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 3.748727560043335, + "learning_rate": 5.925925925925927e-07, + "loss": 1.5587, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.962862968444824, + "eval_runtime": 43.8547, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": true + }, + "attributes": {} + } + }, + "total_flos": 1.10605985906688e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bfdba7250b6db417b66af12fe1a00a04453e6920 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f53ed9cf8601e6d268cf7cc9c18b648202ebadfcf60854a7a7335f7de7b904c1 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9d71144d99bf8506a9c6138e294ef67bcb4708af --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a02e96f8e383d11412f95d7bdbc9b5f1ba691e0371d721fc14007fcb7639c05f +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..2b1c959e3b92a9d3847cd61e595c79a1813cfe3a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0bf8faccd3d2ca94b80304c3092e394e13d076f35c0c4f51d74490ac3412d5f9 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..edc613be9a8a7736c1c5e6c411193a18eb94121c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:116e4caee7c9274e6f2a7d93ee5e67e259426d00592030a182ec1bf7e3e1fd99 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..27efa52efd1a679e82b75530da9cf70094fa7f62 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/trainer_state.json @@ -0,0 +1,138 @@ +{ + "best_metric": 2.458498954772949, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70", + "epoch": 0.9333333333333333, + "eval_steps": 10, + "global_step": 70, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.14702503903232e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..05158166ff25e26ac4725f976061befc623ffc9e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5b9043cda8563d5c0b2318f0b2fff17e5ae4453ee006dcd1352fedb6c6633055 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2736c99f03c91115cda1f02eb16e8d09b518a9d8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c36636d55776566671de6522b8650f1c6ecb2ab0e65eadd494c5dbb47f37ad4 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..0b228b8e8106f666fe286c5d131d496d926a7df4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:debbe8bbbf3d0dfd719072ab48974c332b6f78ebe25ef99f5002c8d0a8c8c380 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8bf25b5c8780313aa53c49c9a020653afda88fbe --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e54696b8c39c3b120a2b1d4d03623aee6400315f6e759074fafe42342c8bf95 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..7a498a14f782d5578ffcf5b9ac222d6297d09b53 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/trainer_state.json @@ -0,0 +1,153 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.0666666666666667, + "eval_steps": 10, + "global_step": 80, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.31088575889408e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1fc24a8f3885ad2fc03a1aca8828450da7b5d9c7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:22c53dfbac922714f7e1bb3614ef4994ef4252d1b9160a9ff31cec668f0ca782 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7c94b0d9136493e944b6562c843e8836f1441255 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9f67a165e8b7fc9570f7df1fe204ff298fc935e51ab4575e6e14ac393489a1d6 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4041231f7cc289aaec627b941b3ce1ed104a3678 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e1884689751e2c9aa53b83d7472089621e5727e27a037b479e2287c7b208b1a +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..d1e19095e23644fde7d19dd9320fdb8daf7fd2bd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:28209e35c6873af016e1c69801c50fdb913d066bb8fab0d3da00cafc566c1a5c +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..efdcb79edee24504b3f4246352b8c2d8ac50d9b0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/trainer_state.json @@ -0,0 +1,168 @@ +{ + "best_metric": 2.457939386367798, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.2, + "eval_steps": 10, + "global_step": 90, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.37896662950515747, + "learning_rate": 7.881481481481482e-05, + "loss": 2.442, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.462397813796997, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.3559034764766693, + "learning_rate": 7.762962962962963e-05, + "loss": 2.399, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.461599588394165, + "eval_runtime": 43.8364, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.3543637990951538, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4128, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.4604127407073975, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.3267190754413605, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4215, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.4596850872039795, + "eval_runtime": 43.8523, + "eval_samples_per_second": 22.804, + "eval_steps_per_second": 2.85, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3119862377643585, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4358, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.459153890609741, + "eval_runtime": 43.8401, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3016722798347473, + "learning_rate": 7.28888888888889e-05, + "loss": 2.46, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.458838939666748, + "eval_runtime": 43.8539, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.3059026002883911, + "learning_rate": 7.170370370370371e-05, + "loss": 2.3836, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.458498954772949, + "eval_runtime": 43.8362, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3352043032646179, + "learning_rate": 7.051851851851853e-05, + "loss": 2.3585, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.457939386367798, + "eval_runtime": 43.9016, + "eval_samples_per_second": 22.778, + "eval_steps_per_second": 2.847, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.41775715351104736, + "learning_rate": 6.933333333333334e-05, + "loss": 2.3486, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4614298343658447, + "eval_runtime": 43.8716, + "eval_samples_per_second": 22.794, + "eval_steps_per_second": 2.849, + "step": 90 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.47474647875584e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..994769391ae29f8b84b96b312773bcb8e5f602ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39ea346efe47de70fa17516f560a06a4c89baa268c191ad9d1f4e6487680e776 +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..3d2c9f205b118e81c6f25e19c3a0bcbb539457bf --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0385408d00d01a4e26917ff980ed727bc3f24ddb65ed53a970e4aec6a57c67db +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2ff93afabc40611223dea56a84272d053744e2ec --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6f418584fc113b3d1e627edd7a1b2c445b90c9aa21ce0c04f05f10d1ff8523e9 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d0cb160fc6752dc0470bb88b1ba16dca7ed969ca --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fd418aa175a4f9508778329e5c11f54241882ad7316c344103bc3804e613599f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..c7c7ba5a5d73c30d2e2dfccf92552709b61b1a0f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d4f7e5b3f15e6248eb69742a14f905c700ecf357f80b4e2f91b8b83b2a38d15e +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ecd24d76f29d2eed35028b8c9ae979d158272e7a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/trainer_state.json @@ -0,0 +1,48 @@ +{ + "best_metric": 2.4511938095092773, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10", + "epoch": 0.13333333333333333, + "eval_steps": 10, + "global_step": 10, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1638607198617600.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-10/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..412a38aba83c08bd277f3d470dbc9a909269b518 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ed1ea53efb4c88577f29d20ce68a2e17f7792ee841ee2098e5b63a926ccee463 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f2ca42d49a46dcd2f0a930488bf5aa0346133cd8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3de25453df763a9c5a9b4d7fd09cac35d7cb1bf9af7ae91c8d05d42221bbca53 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..e6cdf36295b4d559507cf0b068680edea3de3a81 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:46513e9b1de488f3d70a4461303e6b827989f588807354e14d010b7ee4f4679f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1584a89d720c1148f5d687593bce1afbf3ca540f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e3724af4ce6a1f0bbef2eeb9d3766cc9d735b5235e3ad5e31823d425e48bc174 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..1238c3259c4ad9e6a1569633dc69afb71f3e4ab4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/trainer_state.json @@ -0,0 +1,183 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.3333333333333333, + "eval_steps": 10, + "global_step": 100, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.6386071986176e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-100/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..639012c64904b096436b75376e93eed930fe0a38 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bb49b5e212b9d47ef90156fdb6185532549880bdd2c0d57c0fff2c1e40f96dcc +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..27b8135ebacbd2498d8d9a2f8d720178c1ec4deb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b12e060acee9cb4e3c854bec9d1c84b8111a50bb86886b4b7d005add3489e10c +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..e8b03e39b0cf81b4b723b9421b9fca8f87c7b414 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:319884e2d6c1fad0795ced8add37e8073910c77073120da512a5e6a1f6208d62 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..06fd1d13c97f1f1f22fcaf41207579844f1ca562 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8c7b0ed85318129aa45243aba4672dd190ff0cab7af3fbc863e9470ae3ee5518 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..0ce7c8f06db7a354527be1f19b0f0a6c9c13b427 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/trainer_state.json @@ -0,0 +1,198 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.4666666666666668, + "eval_steps": 10, + "global_step": 110, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.80246791847936e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-110/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..1647fcf866aba0358f0ae2ef87d795c01da3ec46 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b211f8f5ec8953a1dac09c5782843275e21157dbc7bb6c9b9386fe794e4c3f73 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..7a68b6a9ea4710ec228489cf062442259fba84f9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cf9ae167a32f35373ac85522dc56cc8865691528cc517828054b2c551e6a5b17 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..71b7a5227226dcaeadffec096acbc7df0f632989 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3500ac793bd5f15c49da717801f854f9815260499ab4bc16b8f3a1ca9c82dfdf +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e63acccd70ed72b56fbae077aaae3c3e860df98e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cb852b3cb535fecc52ae9194cdcdc2f881d7fe32b722d7aefd0c2949aab38351 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e41bebcf4d29a56a56cc5fcfc4e49da38d46e39d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/trainer_state.json @@ -0,0 +1,213 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.6, + "eval_steps": 10, + "global_step": 120, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.96632863834112e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-120/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..f1d47b2fc025de13309feebdb4a5525fb8fa9bc7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14f4a425576c84bfb49e607555a0d75b2e1de4d56e22b5cc2a379698d2e2c004 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..4de072302548855cfd07e36ddb3c5f6f4e725986 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f4c96df13d7ad9ca4d09e29b4d455830c22f0041f959c8589febc803c9c99d5 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b60cc4cb8217ae694c7a8efef0eb0b676d897e83 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:602f503f7cd2e84c0b6719714b66d34e98b340f44b02ba8ffc44df096e786100 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3e6bf31460bcee7a7c8e2d3a44dae57ca9756e84 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fb3d4fd73594e642debe12e534d13131882ea6c660a00fc6f6b39408ccaa0be9 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..90bdc585a42d5306264ffbd3898d5c41cc7f2539 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/trainer_state.json @@ -0,0 +1,228 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.7333333333333334, + "eval_steps": 10, + "global_step": 130, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.13018935820288e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-130/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..335027c06fe70dbc0a43e8db90a8422e0986406a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c7a8e87226018ca7e7874af08bd88dc853db4829e39bb1b590abf8a69d7753ac +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a150855d467cb211bd0613edc52007afe494f33b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8a445900d8e58e25a7ebf096de7409f23511b4448d09ded01efd683c91785908 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d05f19f3c7e1e4b728f62f56852d18785b6ab4d0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03c218af617af689aa7eff2d02ae91fb859e96fcb9571b641c5e95247f137dda +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..2037cfda59389ad04c653da89dc5ece29fd8a1eb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e78eb0e499cbcedf2065f612c2640c14e35853bbb671a435f85400df13b65849 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c2fc9bcfbcce243b42765e8bb8923ce2cae57bf7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/trainer_state.json @@ -0,0 +1,243 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.8666666666666667, + "eval_steps": 10, + "global_step": 140, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.29405007806464e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-140/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8f5e8b2ccef5b52ffb62100f601e5471fba4733f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:77296ebcc6bd9d3f5ef53d5e7c7b50b89d878d28446a157b0f85da86c8f38fd2 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..abf3626ed4d24de4fddf76552857c304d18a0f70 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:904f544926df12bf11b8ed1153520c603a837e4fe9e611be503d62b611277b1d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..61dde1ed8b180510bbda84f0c71356862600ad55 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bdf2188bfe5b1127367f0a0d0628c845d9f54239950b10ed26be9372dba68d0b +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..2f0e2eccf4636123e13847a06c34756ea86e2ce1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:01d6ca2c3de71fc307a0c56f31552cc216bf90d6eba21471dc5958340a2c2285 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b41c4a3ad25f6e3c549268e3368698957e70b90d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/trainer_state.json @@ -0,0 +1,258 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.0, + "eval_steps": 10, + "global_step": 150, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.4579107979264e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-150/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..803350d7f3b051fc9c7dc5afbb91566671ee552d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:256dfe8fbcddca485feb099ecb242ddd150d7a577a88d9a308ac0aa316410edd +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fd6ea09fee22c94bac524b3730f7bc33e9784861 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b28440ec9cb461bd440758d11f1ab0ab0a19248d5b06634dff81316f5c9bd452 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..564fc6da8e7c6b2c0f5b62f1f2e55b96ec29c066 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0f1a4ff62819275ae908067e10e49db3630270d7e753db72e5d286184508926f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..7003714472211c91011c0e61ccb96ea383add5d6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:086780aa9b208bce3e425496825484b08be311d15b0b14460f1ffe79ca5724e4 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c9ffaf546fc59d7d48bfef532fa7320a791c4b13 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/trainer_state.json @@ -0,0 +1,273 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.1333333333333333, + "eval_steps": 10, + "global_step": 160, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.62177151778816e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-160/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..af35e28fea45d00af3478bc1b36018eeb168512f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:544a42389e041abde691fc254e0d8cb56f8dcd69e6223aaa5a217c6da02cb685 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5abcc1f6d2b861e8d6f17dfaf7686d70ccc29a9e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8aca59f85d8b25bf67a7f52996ce34a02733cf576a4e03fbbae49af7ec0ea2d1 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..c13cd397e2cbe97d2fb9e944d382c58418c6b136 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:964f6178720317ac51eb375c889b2d86c7184aa024caf52b59339853ffae03ca +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8ff8931a93a05785d4f9ac874ee996aabb0f9c6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3af7af4f3079ff2528b7320f4eff72c9da8c339b59b25326a218150720679507 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..59afdc53a8eb8a63f8bb3c49ff1069d2c4d91e5b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/trainer_state.json @@ -0,0 +1,288 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.2666666666666666, + "eval_steps": 10, + "global_step": 170, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.78563223764992e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-170/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..22595cb0591f5431de0e30dc342b77f33185f98a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aac4750d54051bdedb1f14aef158380be227fa569422b5e9843c62373547121a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..946518d29f14f10431c54022996a0f9bc1919f93 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b980adee1181c97a75820b005da20496e41167c1d1808e1bc6cc1af071f76f76 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fdca3aeb31ce5b4aeb2c0f2ba53e3e43b6334331 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b79baa0842c2916b082cba36f9f2b958210e6d7c1813742841fb908cae57fbd +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..29c6a3e72ef015195c79fa565499604a99c3238b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f03b6789f6fc5c23fa97b574dd3da7e3a4ab78408a43f9c087ae8f29ffcbbea4 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..743f1569f74b5828308282127c44d457401510ca --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/trainer_state.json @@ -0,0 +1,303 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.4, + "eval_steps": 10, + "global_step": 180, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 2.94949295751168e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-180/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..cf9b901398509e967f244ae3e016378b0d543012 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a589b9d0da2ac03d3a8e38ae2b5714e0caaaf0bcfeee7f8733755270977b9bfe +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..47b26c0a67908f55d0c3deddd6293a0f986bce11 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e700a5934986f342266bd03b110d99618e5881742d14a80808ad56b69944151 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ae44ad6727cf9b3af903ea84902fa6c7f13a5a95 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7d6f4346bdc8a12fcc48535a6002ac46345e4ce1e14bb1f7e9dc3b0ea920641c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b138eaf1e7294b95b63cf0f5e77aaa5651aa7871 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bf989c6683600055d8c21edd8b036fc8ec3870e69609cfeed852c01d4fb783ff +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a182bd5fae73b30b2ba17318963eccdd666ce5e8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/trainer_state.json @@ -0,0 +1,318 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.533333333333333, + "eval_steps": 10, + "global_step": 190, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.11335367737344e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-190/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..f0dfd3e78a623ab81d425b2faaf90c263ca29ea3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f73d08e8b804ffac925f9293be9f7867079d71ebb3f170801a4ccbf5f37e9380 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2cff083c0d78d27b0fc2b37cc7dc9e0470b9205c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03af84cc697c6d7fccbdcc996280ac4c3d9205f59162c1e1575c05a43037ae92 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fe515b4492af517bd45c5a5c7abbba2b94c5ae37 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5087ba42b4dd9dc68875c89890b692068c71de7009ff67cb7d8492bce11049 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..61e40aef0a507fb8add486ba2535aadaa164b9a7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:72a91e63074e9f0fdfc6b1e7414643f389732ccfdfe97b6b3f4c5b0d7a7556a4 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..41d4812f6ccf16830a44d05a7567abd2966117d4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/trainer_state.json @@ -0,0 +1,63 @@ +{ + "best_metric": 2.449139356613159, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20", + "epoch": 0.26666666666666666, + "eval_steps": 10, + "global_step": 20, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3277214397235200.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-20/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ccfe118a573ac47e5a4c1a714db1f737c0a505f4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6066c2b981bc254230faee864073161ffd470f183c16265688d686a4936e66fc +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b367d5d7b4e1e572a87ba232dccfc82b4ded0b17 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6a586de25a3732c9c5af9dd46648ff2c46f8d12b680ae358906ccbfcbe05fcdf +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..da263858f32b7536e68a33626ef41e3ef7a44689 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0dbbe288070e588c7effbe11249d330a3ad16131211e6b5dff1d03a8ebc7517f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b0f07f929fecc267165555e571b711e64132e6f7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7f688f9bcdc66390bb2581a665773530bb1a08ae9bdc1413898ff1347742f43c +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..850ff5e77334a2e7d48ce36d6e82cd520e800afa --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/trainer_state.json @@ -0,0 +1,333 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.6666666666666665, + "eval_steps": 10, + "global_step": 200, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.2772143972352e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-200/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bd52f68637e3713a6c811bd2976d9f901dafd201 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:818c7ffe25faf6d7bfb460f0c1d590e15e989980a354533b879d364902878df0 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0090c20c61ab6994b06b7c2b2421646628d44e4b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4b8792e2246df49ccf97324ab4d4e1e6e04c7ece9e0e0ae943b2a9c81e70bc7c +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..605214081e6b3060d6c3e526fc86e8b8fff3c71b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cd4e0019fadc179e2ea531ff33d86db759cb80e64a8826bb6bfa90c2483bfc04 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..67ce491755f9c5f753c42b9ea33be81dd9f8a131 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:085996a8194241f2fdb489a02e8b9252649984b08945b06670bf62ad13b831de +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..1b1b9e11d4cd3b9fb6f8b03ad8b3c3ad1be0495f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/trainer_state.json @@ -0,0 +1,348 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.8, + "eval_steps": 10, + "global_step": 210, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.44107511709696e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-210/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..12d1b7f21fef0dd05b83d58b14541bfbe794d1d8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:aee1b868db23666d1809e1f7517367d4a07d595fbdec0dcd32a0b45371cf84f0 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..16c80eb67513d75a6de8ff8745b9d0d7c53e5981 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cc63667f06f8fa872b53a405b306b29d501c4bdf61ef7bc45b138798583e3c81 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..823c878e3ad7d7799e1959fba97c90aaf79af4f9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f5e4256f7b7ace2dd6194570c191ab9026456dc0db24025edac4a5bd9e379dab +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ff86f52b850328c2d8f4d1f372c5649cff03e543 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:afebe0583be455041040155b8cf242cfa36c2fd3aed81f0bb042574b3d11c816 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..2591e501fde5631128347186b636ee3be33d0c73 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/trainer_state.json @@ -0,0 +1,363 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 2.9333333333333336, + "eval_steps": 10, + "global_step": 220, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.60493583695872e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-220/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6c51fce37381a9702cc6cfd8b443785f25abec6f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6f4996867bc2c6a0e889fbd21e6856d685f7b9698bf87f75cbdd81b8d94b5329 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..a000b62057078b1ad2b2ed21789fc600e545bf4b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:140c8d537460218e8c0856d02a04de80a2d6bd475463d4a1a490d12ef89b8b88 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ae85ad205796b2c3955218eb7b4b348ca35978c7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3e2b38199e26ee1965ef79aea019c0217039e7dab109a4b6e29c57f1bea63d6d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b0031e0c496cf5abb25ed78d6e38e07a220ce043 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6d1356fcfd01ae5906aac09e7ec62f27b1afad01eacdcb84f123c2c4502886b8 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..111844576a16e074aa7e8bccd8e61ddd94e0ad66 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/trainer_state.json @@ -0,0 +1,378 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.066666666666667, + "eval_steps": 10, + "global_step": 230, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.76879655682048e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-230/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0a2cc79a0c77fcca9d7faebeaa675861ad39e370 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:acf004e2bad8828d41009b3628a2ea32a936db49870c1d9c3da4e91305defb74 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b1ab68c775b725e03370d7a029ce9a6c1c266107 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c2a58e254a7c1de9a4307e34675dedf9d9094cd1b800a0f2885ddc7a5ba15622 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..846c31e0418b3b3196b4e9c5d730a866c947d1d6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:33d7857a6e3603508425c326c1a1dee439799d2c72bbfc8afcabbb8578757780 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..3e7e86f4a758107d182040d0a6b86e40bcc72f53 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4730dfe02541191e7acbe6178fa9907735bc80cf60dc356460f9a0ca3075aca2 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..e1bc3bf71432b0c5744a4de412dedee755250ac0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/trainer_state.json @@ -0,0 +1,393 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.2, + "eval_steps": 10, + "global_step": 240, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 3.93265727668224e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-240/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ebd01e30300e86055590db3a8f0c21f3c225c1ac --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39f546f439ade09ea6df33c07dde42f239a4dbbdd1a55c235885912e0f0f58a5 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c72e1587b2ee2bbd9ab31e1c665e66c908e3a403 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5987e2efbfaa65a553bcc612d650f01ffc73dbe98483ef47296d2cef4d16c171 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..90df82c0a610ae490c2592c79d46fe23cde8d351 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5b7a10b9f8de84d4eac8f0b5437669695e0a3ed004e055b39340577de17c55 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..dbd2dc1560f92e541e8ba7bc84bc9ce93c9d1a8e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:cac52f10d00ea64a7e134e5021b8e94e73bb2935342d54982a8a3e42dea214b1 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c60976329066da03ca5180ffd56084accea245e5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/trainer_state.json @@ -0,0 +1,408 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.3333333333333335, + "eval_steps": 10, + "global_step": 250, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.096517996544e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-250/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c4ec2883171c31ef30bcd5810b6ac0578f1cdc48 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:32cf4772b8a1eac1c5d7e60e4e1d7374f2954922607fe76edd7ffb6491f2dcf2 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5c32469d6fe53be40a3623f57350c6fa8de7199c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e522b769b678c0d5e797a0b18542abde75df4d15cc3e8ae828615fae83ca1b2b +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..293d181974003fee2540af0648cfb4e42786ca56 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:78bbc69e88d5e1fb15138660b4de76d03b9476fa1ab2d16370f894a65eab3da3 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..bc0232202f7722177b07c8a15249e87e3fada2e9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f3319eac6e1df4ca8a75010892ff7c074298859adaf953994c8af52485f55b53 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3337b95f8c59057f5ee8cfc1cf1a08ff96a0e87b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/trainer_state.json @@ -0,0 +1,423 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.466666666666667, + "eval_steps": 10, + "global_step": 260, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.26037871640576e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-260/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0380cdc536859e4ac9341020086062ae2a3c8ee6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:92c0364270dcde5af6a760f8326ca70060b280ae1be20b6a4288b0d6c0428948 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..6499dae2f724d37f44b3967edd80fc7f63064100 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5df14818e51e63f493ad0b8f00d071f79877249dd74823e0ce39639eb376ec9 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ba62c782c818c1b90b0344e262a00bb91255dc87 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2af2c0de08ddef877a4af0e5f2dfe4570d2f029659f125fbfe3bbcce3a8b09e6 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..aa7eeb3a87ea1ca1e7352f08759f4051aa8258a5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:abb06701181b90f771f65b0cba814986a9b7ac92e4591d8a01aa1c6efba77ce1 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c284c8adf609678ab938dddad0bc16d51cca55eb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/trainer_state.json @@ -0,0 +1,438 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.6, + "eval_steps": 10, + "global_step": 270, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.42423943626752e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-270/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..69a02db43698879516ebaee5505ce5aaca61c243 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c4f46ebc2ca023d0dc190a94b2f53181b3bef1177601cbb99495834c7139bafe +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..93937d1d480a4ba61c121ed51b3e228c252ada93 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a8f5804cb27309c61f7a506eae7960b751555115c9e203514d1d7bada82af10 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1702f62666b39cac633a34cf312f24e311e13df2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9ba79aaff190fd3ef9f70dd7c0a234665c2bd6c6bb243b5896c5bd6a16356627 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4a657fe9e903dd4a89dd04673e22a17c86dbad42 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:819fe3dcc18383ccef3dc2fe508ddac3e00702050d5d602aaf094a47cfd838bf +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..7b5047b30ac2b419a8b0b1f32cc4895b56a2dac3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/trainer_state.json @@ -0,0 +1,453 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.7333333333333334, + "eval_steps": 10, + "global_step": 280, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.58810015612928e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-280/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..11792b4bc0b4ce5f70ccd95763d6b73dc4ab7392 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2ec726de5f800b46346579610d57571196c9d7e718263429cd5c4166e3e8bf8b +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f9b99322628922dd22148931cc21088db631087c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6b7de30956e6c20d761da5bfde5d9f93ba015439598ef4eba6ac6f184df55d81 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fecfedbf1488a31afeaf7c01dc4f9760cfff1b16 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:47c6345b8afbd1f7a687e942ce33ce022660a29cb46a23e4c9eda9e498053741 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..933185741fddd4e80babf7cc8a77cff149bbf3ff --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1b158d5ddc1d2b97945724e562a4c35df5c18023a4c36f7bee8f109155034bf0 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f19b83457355d0b398369dddb5feb154624bfa47 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/trainer_state.json @@ -0,0 +1,468 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 3.8666666666666667, + "eval_steps": 10, + "global_step": 290, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.75196087599104e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-290/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..af942f437f20b27bc236ee8d2f4033d54b2fcbab --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c3ede9883916f3dfc420ed32974f93baf57d2519654f2cea4d5852b1462d727a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..6e4727c0f234f547fdc7a9a0dcd864d5a2944494 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e17de2bd941af542344be2f411cd66e8e448758b964eff953296303d377efa8d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..76ee62462f7b8b87edaf24539d12d81995c70164 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3a5478e4e53ebdf948038ed344f6e976416991ec94630cb094a18d5adf7aae7a +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8e3204abc81bf616d4220ccab7f0f13520ce949e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:19debbf018dbf40b240b0a2ef65d5d10de2fa92e61c8838b0319c8c96ad962cd +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ad72c323f8163a3c9cb631531c44e83754c56a13 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/trainer_state.json @@ -0,0 +1,78 @@ +{ + "best_metric": 2.447827100753784, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30", + "epoch": 0.4, + "eval_steps": 10, + "global_step": 30, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4915821595852800.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-30/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..69ef9ca69330ef80d0d9a72e022a499d699afbb1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:38f28e863cd28e65e42c45ad5adcdab8486afa7a40d6cc99bd3ab8a42f3aa497 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..136f4b51d45fb76d6fc669e27a2d4cffa61d4718 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:65b9d0e71af1024166e329eafd2f401d67f86a5d2c77ce65b3a879b4a6d4c6ae +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d8ba268ef07796e970a23442889935701a1dda5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2574c6149307e492ef05d2031918a546356cc654f4671c817f05ae6d0764de7f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4ff153ed07693f90bd9914df3d3a9100087ba152 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4b3861b9f1c59dbd4e1ff81c91a81a79174248e3676d03815d832ad9defdfa02 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..663a7e4bda9781f063046692a291aedbb970f316 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/trainer_state.json @@ -0,0 +1,483 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.0, + "eval_steps": 10, + "global_step": 300, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 4.9158215958528e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-300/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..161bdbd709d4529608b841cfe73e287b89d88e90 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dab4e4cc274f7d923d0bf99a0c21049c0605a5cc3b9bd0a95de16fa782e70217 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9db2bdd8cc402fcafd32319d79647ca29b61b13a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f425df22b5d89572924611fd72eaa28e80a896e89f0b1f42c3c32e845735abd8 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..a5b4503b006d8dec33c7a086d3d007eef4282144 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a82d768c5f5c231c8b50481a409281b8639e231a185281a7476164488eb6c27f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1e086272f5dd0984e6962757f823d9f2aacaf771 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b111834dc6d7b66a1fbf92ae20f97bc4a522817eff7fb700836a612cb5f0fc0a +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..29071f1cfaaccdbae913c65e6395bb873fb393d3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/trainer_state.json @@ -0,0 +1,498 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.133333333333334, + "eval_steps": 10, + "global_step": 310, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.07968231571456e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-310/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..f05e4a2b5897ac187bf7653ed6b7ebf04a3e2300 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c2fbafc178da1bd5271f6c1532b44ab3d3cc1ad6cdad2fb79c0954617a9af75a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9444412999a6b62bcd72fcece70723858e5aa284 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03367532e9e714344685ed388c370420bb3fc5a2dc0629214d528797f214f6fc +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f5fbaf3739704eea759ab29b4b9eba0fecf79ee6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4f581763059f9808c6971d543bee5e034fff1a9ec174cb7aa232dd9f17099da0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..06cc1ff50a21839b3e48cf0665d3cefa585d78e4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8115eeb961355d33e9b94e8195b37bd4c7ed81f7ec0848bc1eb1d9cac2edc6de +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..fe4c0afef02492044e26ae43513a8a62fbd14057 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/trainer_state.json @@ -0,0 +1,513 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.266666666666667, + "eval_steps": 10, + "global_step": 320, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.24354303557632e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-320/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..70a0f27d007042cac864139a909e8761a93f1075 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:74a9ac488b2f6ef651bc594236b9943b8fd09ecd7a2c740264ceda8cf72915e5 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d5567a346c9647bbe5238d7925a3994bcf254d69 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:398e73b8ff2e1685ee5b019d918fbee67224335a4145327d196532f245928051 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..759bff60bd0897427bf9d4410df520d35fd20081 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:389caf1bb32aae3a751e11d63ffe273f089df59490c4ac6e5883d944b329df0b +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..510498bdfbf927c4f91d20f5df576c0870330f23 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e7846e9dfa198e40139a7197164c195f2df220569527afb56c302574b26feecc +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..238c3751accef731a6c547a3fef51d50cc89fc08 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/trainer_state.json @@ -0,0 +1,528 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.4, + "eval_steps": 10, + "global_step": 330, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.40740375543808e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-330/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..0000bb0de9a72460067bb4c6c3a98def2dcc4631 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b6a72edf9b3b25ebd4dd28d1a4a28f68b6c7050af48fecdfcddf528b4a5f5c93 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5a23bcd86f05d6997b9d07a4e357fb7187946370 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa8ff3aab772ee7e4cbfcc13e6649656eb3100efa399a5337f2b18d63ff8cb83 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d7fc830aabf2c4827b0609ed6e355d0fa80523b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b904f845552beb994fcd34362e728f918c7473ac27288d463195b51c3ed73bff +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..b2af0e25b5eb9ee316bb52290f784d1b888afec1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7628a0764a483902d1d79e14dcda9f869536b36387bbbfa74ac4b65d7d6f31c5 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..86f347d8423b2963cea704a4c31afa68278698ef --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/trainer_state.json @@ -0,0 +1,543 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.533333333333333, + "eval_steps": 10, + "global_step": 340, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.57126447529984e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-340/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..2fc70eced3404927ca477af1d470a03f6414889d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5c65fc93d92c6372d98e700306e6c1330ecd291a79751ee167b2d1922f8bb28d +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..2438c5217f930d080c39477f30edcac8c0f1b39a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b83e2607e4bbc2d020ab5deaba33f54c35650a1a4c2469299aff3f5759cf3064 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..fc3bb37d365dcd8ae3528d8e7242f7d2eae755b3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:39cd0c0a4049d541d90e7c6154cb21167a341830884ad3558195617942678446 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..32ed638e5f940a93d6c1edf07c1078983c5cf18e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:43032740b7c93900a39d6cf49e91bad79346f4a3eb668022fdd934ce6f663c06 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..5d2ad33f0a9483733ee81af13d1bca6abbf71e8a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/trainer_state.json @@ -0,0 +1,558 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.666666666666667, + "eval_steps": 10, + "global_step": 350, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.7351251951616e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-350/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..60a9f81373aee397d2a893b89dcc298b7c1d54e5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8e2a66366d77ed27b11764dd0ce34f8720e29986a5c57a9665820fde711fa2ed +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e092a085ab9710a63ae9324f50fca3d8a6453596 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f92c07a599c3a3b912afe0ea9ba55a9952e6d5373d7bfb3cc1dc7fcfdc93602 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..dff7e422d3f8fc71ea77fa33b28878ffbe8abd43 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9d73d43b628bfbe3f56e29099c04e9e9584349f935d8148aa8c34849bf03ef49 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..2bbf4f07128584a48215b358c0b77d0be32e6b39 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e8ee70f63040dd0a5060dd1b3765434dd55c79d34817b88238d67df5aade8bd4 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..aa401e5e6c185f135fd9212d6be834d73d1936dc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/trainer_state.json @@ -0,0 +1,573 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.8, + "eval_steps": 10, + "global_step": 360, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 5.89898591502336e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-360/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..6204a41ee54d0def2677fe76f70049bd168cd86a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f5da5dcee85937ac749c2f56408b12f8ef050d819496e37a62591bd9ba886bfc +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..4c1dd34d9d6f2a1dfb6f6562dcbac145a2848af7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c6f931fa83d0ef9380c50053b7a9a04e9ef617137923548c73babf09e9eb1a23 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..792417d4c800bc4c8f7eb21d5421678309a6165b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d7c0e313f3d6f9e1adc7603b9ffa6f0ab3438f71ce0c71bd9a788485d02b981c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..48e2de7dc1ccec6d1e3ab16051752f328cfa6a60 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1bb0c401c96ec4e317630d7674770ec6960eaab0c5b3b1f1a7a97a19731fd7af +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..4adc126fddf3a327fd0a5c0a0137d5feb92b738b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/trainer_state.json @@ -0,0 +1,588 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 4.933333333333334, + "eval_steps": 10, + "global_step": 370, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.06284663488512e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-370/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8c3d39fea5438d1afd0e16a29daa9ec0fb31bf23 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:79dfa9fc3949969fcd2d5edca73310368ed523873ef37a03136a9045b255aea8 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..bf5fb5398223029a257b664f46be7ca2464b563b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:56fa3b0a4e6434912e6857ba49465859ddb63d5f884eaf9e14fb51f21f9bcd72 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f3b952e81c9ed8c37528c0b9d4c13811ac0b62d3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5ce5744fa32738c65fe7785ec589c49d96370233c9386567c3f06dceedb5f2c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6af3365ca4cb82ed9ad457f2e821e52edd19005d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0d81e7503e0a4c56a7520dca45690146cd6bac0af0677c1ad69c01c54b4dc47a +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bf83a37e63a97cf90e6b4c8fefba2d67c0f0e0ca --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/trainer_state.json @@ -0,0 +1,603 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.066666666666666, + "eval_steps": 10, + "global_step": 380, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.22670735474688e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-380/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..cdb93db2a8b34c22367fe85eef3fbc4f8a30e464 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6e29b1c6763ea30d2018eb8c01422349eacc3b7477ac6f3cc696eafee5b673f3 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..68e1795a8b0261c54517315bb0867ec73f98dd74 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dda288fd152601b6ce4a5a9e81db97e2d65fc1053014cd029691044eacb413e3 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b458d8885e612e71d79c420d6ca3a40dcdcf7fd8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f47a6a8940dea009f3b7ce239248233dd458275df17acc4fa8ff99eb346e8979 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5d09f975f6c31b69b0a1472930435673bb722884 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e76c5e20da406e2a151a5d5fbd113c04547a6f95bca1af646e6494d7acfdd357 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..39289c0ea0fe6faea94307dc5009b6a01828922c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/trainer_state.json @@ -0,0 +1,618 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.2, + "eval_steps": 10, + "global_step": 390, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.39056807460864e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-390/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..2085ba6c7455501438008556beca5edfb867140c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3d4a1993a12c271208cd468927d1c68fdbfb6c7efe3d989c3076dcc97efe8532 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..43d8e07820ab2b4b50fbac073a139f3392b5a5f3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4edab51f670af97935a825abc3c64d5c2e5fca9ffa11de704b6478da01617ed2 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..cc0cb9030af17e56f3ab00fc0ad6850b4636069d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b5fde33a4ff115b0a519c0ef179183e0540c837c91cce3dba97312fa8e725570 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1159228ea69439db76026731513cf5c71e57f3eb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f953d62fd365ebab5cb8aad6e7c0cdb075e95f55a4cb36b4f4e0198710f2320 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f0dc3b3d8d8de3b7d1d9660635f15b2df4f9d480 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/trainer_state.json @@ -0,0 +1,93 @@ +{ + "best_metric": 2.446575880050659, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40", + "epoch": 0.5333333333333333, + "eval_steps": 10, + "global_step": 40, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6554428794470400.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-40/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..122f8885ab8db4b50b4882175295832f8bc19843 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9f634e4ac2d1a667c9f5eeeaf81239b97aa2e1289604c2bd4b64348e6e1a67eb +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..4cb48ea860c1d631eb0692ee4723b29582d07e71 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bc6f0078dc243b7e12f3639a7dd719b1ba0e2db07fbe1346ecef084a2d44b4d7 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..d06e3c475517e0d14c13a6ccad84a3f20110949a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:96f529f9856ab8a411ac6b8078e33cfc18c0159c4947cd8cac8e1238fc1754c7 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e8342e5b3394e0fb71196ee110a91ffb59d4e6a0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4ac856c0bd55274bce8d77e7de111f1634e866a734a52c31a2de6d86eb94147e +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..89e9c7b1a880b74a714010f810257db20e641d36 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/trainer_state.json @@ -0,0 +1,633 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.333333333333333, + "eval_steps": 10, + "global_step": 400, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.5544287944704e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-400/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..cbad2dbc4f3ce879011ec025bc1080c167a16c03 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f1996a696cc9449de49164d8a7f31e682d891c5b7af79fb48c9923e64bf78b42 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..620175bd6a35368de76fd3dc83b5e5fa6033d3a8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:41bbfc08f3afbb3f84ae0db00f155372971bd8358ae0e6321689bcaa89ccff7b +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..090a1de878697aa3e6255ed23ff26ce6e561a9fa --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2cab01f3c0a9d66cf16eec91d8aebbfd533628e45bdb849b4c3e4ad317f15270 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..a3accf1fa422cb508716beed4716428bb7a2dada --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:65925ced20001fb9dd7a174b107834421c0b8463f908ce5ae76eff70794425ee +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..3c74ae44c6cdd9b94c70016b9504197c04fba143 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/trainer_state.json @@ -0,0 +1,648 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.466666666666667, + "eval_steps": 10, + "global_step": 410, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.71828951433216e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-410/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..545eaad5073a7bf1c5c760f8b8ded3b5d847c6ee --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e164798a8b911c8c413617de230163b18528f9feef6d03ed5a4373ff999e2f2 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1bbe48f06df27c85a99afbf3f02b220160f7906c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a243f5d0b7d6614fbd2c8c8ab8bf2372f1567a7a165c30ba4d4b8db05c00c4d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..7c168ba589ab149907f65c12980a55da76890995 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:02f02c3c7264962c7bbb05c73c2c2f9530a34cf2c29d550cdc787ae19eb6d9bb +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..79316819133dfc6dd374dbd7c6e335bd210220d7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2c45d391a5b10cf421d2d613e1869cf5682674cea9241e799990f6162a8967fd +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..15d50b0a8c1e8d718b36e58935265f0852487e04 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/trainer_state.json @@ -0,0 +1,663 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.6, + "eval_steps": 10, + "global_step": 420, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 6.88215023419392e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-420/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..698dc87a3868c32de5146952905f340fd3d8a54a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:270a620b7de0db432206b1ff5a7f8c9af0d15bb671b3ae4d83130d7ab78f7494 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c2911684c55e3630fe8ef09a55f22001a993337a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ed85c0736f07347a318ec2236ee47e4067b5981a7bdd859bbf236196779037ce +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..eb08c850753d158caff59458c0a4d2fa22ad5de8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5f5c1faf0e9eb010c64f51b35236463635709da903fff7194839666558e862b6 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..bc6519c3fad5713aa55f6e2dc2df85ffab921f35 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:794dcff7f2ecb87626b523d3658c7cd77bc71a3da8b1a747ce331847f71fcf83 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..c69f541076c12540c28f202888021dffd063b197 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/trainer_state.json @@ -0,0 +1,678 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.733333333333333, + "eval_steps": 10, + "global_step": 430, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.04601095405568e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-430/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..f813eb52c8d758d9546296d241a40d167053e3d1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7e2499899634609d7996cee470665919cab717bd6d35c578d846c5f23955bd2f +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..8d755497be6b694a1d9b637bbfe5d9f99b4f4d08 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d2f3d1b63d6cdcaf0f25a50121b9125b052c20057ecde5cc4e4ea1286745e871 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..5fdc5e50e381540856fecccc6c375074d1aa7b0a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:54abee51bb88479cda4bf77e85c2a545e7fb3c5e42f56d1baa63f1344dcc0529 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cde3e45656e3bf8fa17fcf8906543903ca4b6d49 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a743260ed11c2e2b34343c388955b60e9f81763136039ef739f903addd1ae49c +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d7520f58a1f375ec0b20f6d45e04a861bfc70d6a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/trainer_state.json @@ -0,0 +1,693 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 5.866666666666667, + "eval_steps": 10, + "global_step": 440, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.20987167391744e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-440/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c164bece4fa345d0a054ef9e08484bdc0aab069d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:35d265d4db540a84ead2c5c83cc154d0f6251840f105680bd1196f47ec0715a7 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e5b7bc5de9cca0c7cb8ffaa5d0654dc522a30a82 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:06287975b887b70c1e02ccdc8e85039052f8d791b7f035513ec40ef21cd71a99 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3e7c44b011328e871a23ca1fea7cc6ea78d70a29 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4cc0a8131f9f14b855b33975c5e795a94be3a332a0f3cf68a9ec3ab6ce73b177 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8f99a8a8aca6f7065ba77b9d9a24e7340215c710 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c9f4ce70d201a3eee6e6e595904777ba65200b8654c193b25c8df60de453acf9 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ff7b30be6419a1b97ce69aaaede9aa106a0dacd6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/trainer_state.json @@ -0,0 +1,708 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.0, + "eval_steps": 10, + "global_step": 450, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.3737323937792e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-450/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e63c3584b9c6ad7dcae2068676a48f22ea5ab8db --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:99519c74e0e5d7a78bdccbe5f9db50ede934be31802fe5aaedcee30c0be30b3f +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e5908b1bb2ea4de946a5f9e4c91c59cd821c2f8b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5ea638a4c0041ac90e4b111cfa1b4a5c27154823911a7966fcf5e8309bec014d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..82f7415495fcd1c3ffb5dae79c8c3a4c2269faa6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a6424cc1a4d391795fbea6a94823363dca21ce0e7ec6c433e8cb5b0aca0060f +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..542e60faa7528741b10f09fb841fdd20a67f2908 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bb750597db574629acb2818000a42851f3a160d1335190de1b3f61a35e4636e2 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b651ddd94a6e695dc952daa12fdb6095bf40100a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/trainer_state.json @@ -0,0 +1,723 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.133333333333334, + "eval_steps": 10, + "global_step": 460, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.53759311364096e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-460/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..8f3ef250cac7866668956cef81a00aec3e400f5a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f5a048576fdfcc49d425956b8f9b09d67c186de17f933374224cefe84559363 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..1211bde9e074444f81f7fa871b05c013a59c56bd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1151cfa7ea030b98240fed97d8b1910246c7612c6abcd05fc745b309e2aebfee +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..84ca1f63cf231e2aa1c43b465c46ef11c80bc867 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:03fc4a1860f68759a4d7833f4317681e377d4e71cf91ab1f091da8cd71579d26 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..015e6dfa049b8660150ab74dc3e29c51ee9a70f5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:64a1d8883ed790c5a13fdb8cb1cd5e8fdce81458d148c2cf5add183af2232b77 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a276a178f60f303f41c3f39900eca49d1341ede1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/trainer_state.json @@ -0,0 +1,738 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.266666666666667, + "eval_steps": 10, + "global_step": 470, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.70145383350272e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-470/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e579c50618e395289b732b0c33f6972aa736af3a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:63f82674b7fa01d79463411fd4a23662285c3bc14e7f3354b4b69e57fb2ddcdd +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..910a0af896173e66f67a43d01ccde2fe43f52ae5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d27d8de7230d38b52fb4d136c7d9a8b826a906f54112d53d07955e6df3cf5771 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..302025be6f88ae472170fe5d230ba39d4ec976df --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:918d6ec8ede8d7a880512e2fc44b16d7c22df85e8b411a004d142edcf446c40d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..23a8e29c0e759fd8cbb591ad1e95c627a9a2b075 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f792a5e7967791cdd054d2d37994a055a027771a2408f509dc4ba9ab342dd26 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..ee554dc42e079064884218069192f8d9d115ed1a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/trainer_state.json @@ -0,0 +1,753 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.4, + "eval_steps": 10, + "global_step": 480, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 7.86531455336448e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-480/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d26c7ee25d6602c267c4bcb7b64a07eb18a85e40 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:34de456ad978b5915c6a6e5216de375929f1674e2b3095dde003197e1f0916de +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9d90c21141b012f65f9ef3605a64f3b8167102e0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3f21b10f9be30423aa6894814073e1bb161dfc5890dd5b4c660042021d5a9d29 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..031b265de35950a615eacc2c86e46292f552e541 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:b56a3ff26dded8216d560cf73ba4817b5973851b78edbbf6aa9d6b515761df8c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ba8d4c3d9df38a2a4f0b799cde3a24c98ed85d27 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3c5c905bef5a70736b808f65f5713cbd8c5995bd14fee4e2222076f7b4140ed7 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f9f3fd35c4c398d229278fa2962c7544b954e8e0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/trainer_state.json @@ -0,0 +1,768 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.533333333333333, + "eval_steps": 10, + "global_step": 490, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.02917527322624e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-490/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..7092abe02222a764e40d1c215e111fad89b4d48f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:25938354032adc89fa0c600e621b21adb829cdcc850ff9cc793df15d43333011 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..179e1f5fe6429cd0b2fda4293de5ead20e3afb39 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:455cd1cd9adcfd71e7e32435bb2ba693afdc1218727a44ab81318b40ca2efbb2 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..c1fc54eb4786e9f15244e8e4274b14688b87da5d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7062fa0264c6fb17100531852b46c235ce631a6626d5e19749a65ba8723532c0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..cee24f7781db565e483521e84ddc6dd277a07ef3 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8f79415c3ece613ed89d676bff22f42086790a2bced0de6758824fb8c7e27fcc +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..cb12aaf965c255bc21b9166f5b0aa67999255f48 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/trainer_state.json @@ -0,0 +1,108 @@ +{ + "best_metric": 2.4452807903289795, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50", + "epoch": 0.6666666666666666, + "eval_steps": 10, + "global_step": 50, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8193035993088000.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-50/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..bc25275848a3173a1ef70606eefa76bff2eec0b1 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:37e4d71aeff159a736ce0da5671d9e0eb2cdfb8b49de41c9f4c8c4b11454ce73 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..cc6de2eea19ea5b8b96a483768d03e2abf515a3c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2913b25f9ab67a930839df29894bff9d03199450242d80f0fd3006c3edd738fa +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..96edd96602542afab3935d537c8d1428ce43196b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:beda198a64f1e6f1db0895ff6a6859c2af4c98fbf9c15d1daa4dcca9c20f50be +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..978bda0016f9cc19457f5905f88394b6d0e1174d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d2408ca43d918f2b0acc523127b0955788c28cb5681dd00cc03149b51ca53832 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f468c465ba6b8da534256a898fd12794deb0fce5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/trainer_state.json @@ -0,0 +1,783 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.666666666666667, + "eval_steps": 10, + "global_step": 500, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.193035993088e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-500/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..163d617bac5cd6af63e57a1eafa7734d18935cb9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:edf01b8edfa873ddcc8a0a87e55b2561ba57fd046be96bb6333116061c2c2bb4 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..344fe52a38102dcc8c7468e78fac9968ce99684d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8bcdac3ec036675522dd8fe78b17873c108f9b8f2dad391a48004b3b31444dd1 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..52b85f2bd42c764f793cd9aa8382577ad1b51617 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:156b16fe2af6b1592b431fe36919ba4914ab9e672f318f884f5045be66654277 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..de33195460f9848f6f9d7f8d587f0fd570e7b0d8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d5dc4ff0a0a28b94a6edd0ed1f658c56b33546529d31a1dc50aa0da26a4d21af +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..dd17e5923e8b9d574bd91a6bdb447d504027a261 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/trainer_state.json @@ -0,0 +1,798 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.8, + "eval_steps": 10, + "global_step": 510, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.35689671294976e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-510/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..5115a6d3cf8b47ef66e873f70f0be09e4da0edc2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:daf10f2d85bb55b55fc6c28442cafc717407d4fb9f98821893e61a349f43d401 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..296ca2ecbd4441bae910533bf37ab1989670ebcd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:136cb8152bddf90fc959fc32bae41afa6a06575f8f5462ee53867203467c2145 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..736afdcce42e3e1d5dec3aedeed239bc0b63975c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca29f15bc2264125f00923607dbea007ec921af3e528271a2bb77db5cd4d2b66 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..dcc8e4188d637aab8601fa04eebce72b72a90468 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9a81343523c3ab8eebb89d7ae0acfc8701ca74d9022c986afd66e744f282bebd +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..87c8910c1ba6e1c44098e65152a69295acff3428 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/trainer_state.json @@ -0,0 +1,813 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 6.933333333333334, + "eval_steps": 10, + "global_step": 520, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.52075743281152e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-520/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..aa67048ccd8cfd0918abb85040c69a7603127ee2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9ec75bbe33b0ef84884a21fd90906e28c7b303f5f4b03d46ff2962bc4205d023 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..fd7389d19fbfc589fcb03bd6f6170d990de0a5fd --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d64c2af24a424b5b912fd131fbbff05f89c165ddf0d2da6efab1552b88549ec0 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b0413aa128dc89fb63c7a74242ac1a6da3ecf5bf --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e9436217a6dd3838565d7b9845d97ff2e933eb514cc6ac99465ebc3448de3312 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..6d1511b1f9e2034c73718246fbc85329071cd804 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:92ff35b0fed4e581f56ac25563eb2b5094a7a8b0870dbaa3310f55d76f7d4c11 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..a9b15c74ba6f380fa528ba66758a213ee3a68a8c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/trainer_state.json @@ -0,0 +1,828 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.066666666666666, + "eval_steps": 10, + "global_step": 530, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.68461815267328e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-530/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..320971a7f4c6bdb8cd5d1ac760fe29e23bd5f7c7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:365d6dd48c8ce6ec13b4a9a4024a1857435e81058921bd8d09c07f5c3e332cad +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e280724672522dc8135e46dc4098492a279f02ec --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e10953c248d4abe6d3d80042a96073440639b8bfdc33f52a553eece408a0ce4d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..8d48caf21e655a01d7675a2b465c934cea676943 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:816bfad4f86e01da7fe3bd5bf7d10c902cf135a5b5fec9e0170158290fe5828c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..7a88ab3ce5f087f1956302514b70910f70352c1b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1a5d5a2ec392db9a81b2192706ae839ceabb8bcb73f41cac559f4828ad9fee3b +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..cab870c2a6ddc9e6eb0de4cba3a257ce16aff809 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/trainer_state.json @@ -0,0 +1,843 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.2, + "eval_steps": 10, + "global_step": 540, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 8.84847887253504e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-540/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9124292b5a0e19e6eb6ed6b5c1abfc4830d92c73 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:71a91b9979c6fa97eb171907865f3941a48fe38b699d2b108c1fbe6c47a1e093 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..64a4e50c6d7f5d3da3305c5925d967a806e6695a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2c877a4059ae296d857ef41c6230508b97270a6f98c30a185ae3e654362a322d +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..9dc1ec111f2a6f7fbe8d878013e83df65b5f618a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5b6faa8c50c89ce52c86274c8c795afb3f00524e7aef4544572df4b5b6b12c6d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e1eff64c0ca4a6dfdaf3ecce0432ab852640e2a2 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e236c54cbe3accddf3d5520ba0cb518b40e8abd35f271bf5f14f28e45d7fcbc4 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..cf529eaae90703c73f582b9aa05d86b072c3a354 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/trainer_state.json @@ -0,0 +1,858 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.333333333333333, + "eval_steps": 10, + "global_step": 550, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.0123395923968e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-550/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..73eb487da4a57f7a158905c288d8dbe6d04dcef5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4a4d15a9cf35cce29391e3c44071017f84ed4f3e63d4fca70166f6c28faac844 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..346655d23ebc92f3c9eeb8438378c3ec3b7191ec --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:7eea7ee58b687fc1dfbf3176d282ccd8af36d3b65cc468165be453a13ae13934 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..75311ff97c8628cb71fe6f6cdca5e9e1127d30b6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3b6745ab2a92f54dcacb73c3ceec9d54235e5b225134fb7703879ee6185ad897 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5756f6c9fef12c1a43ebc667c4ef5157189283d7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:81fe9b127854ebce428baec36760ed1aae5422f8137541e4e087f9befc05df49 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..b96afbf65a9528b4b71608ceaa100a67bf78625e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/trainer_state.json @@ -0,0 +1,873 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.466666666666667, + "eval_steps": 10, + "global_step": 560, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.17620031225856e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-560/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..668b4180dd1ab282ca0e6048ff3ed3963184d28f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5187c7f892d9759f8e438befed6de4e119865ebf01a63d7f04b705bda1a28327 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..b5785e08dd55bfbdd7abbb19b328870739ed7225 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d4ebe13ac684c716ab98263b6779c7b634c6f0cbbdceadf1dabf163db1a1cbcd +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3ed38f9a78b3dbf6f2e73e5bd68681ac198b1983 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d966d92a47b281ed57ee7f44ee2eaa60a54786f7ca9b7e8829ab8723bc8a5a1d +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..67a81d380acccdb956aff7d7d26ca4c653be4c25 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:48b02389d16963a10b13b73ed0bc8186a660fb6c3a3f111137738be693d308ee +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..07623b0b146f529f91720c3ca40f0a95d47bde23 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/trainer_state.json @@ -0,0 +1,888 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.6, + "eval_steps": 10, + "global_step": 570, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.34006103212032e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-570/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..ec7145d904d0e1ab416875a4c87c5a34001b2f43 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:31ca59e4c1679ce67d515ca9b6e8d0a6b19c99bf26dfb4e2aae69acab552fd63 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..98bbca7546a3e2a0e3e0bc22c66f5714624dfb24 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:eec1adb19789279739e313ccc12427fd1e715aa28b8c6328a71d26a36e8c6737 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..6f12baaba3ec135e726e0b75dc20ee8cfe8a995d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:55a6ddc6425602c9554969e2910a1ee66847f95ab8fd86352843e16c6530b2c0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..480817943f7c3179c952be34878de231571519b7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8cb626c0c06a34c0f16881e0626ef50f89e734a493b0639f701b1d498ac5803b +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..dda3912ada3c41f03bfc686cfb5cb9793ac11d75 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/trainer_state.json @@ -0,0 +1,903 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.733333333333333, + "eval_steps": 10, + "global_step": 580, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.50392175198208e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-580/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..d53fe37df9dd51e1a1bc2b01d7889f4983c0bdcb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d88800df6d13f2c31cc9df3f510fe9eb4cdb410c3da85df3db3126b4e956065a +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..dc1a2c464f14c83188e49bde291881df26c23eb0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6e8c10c803c5dbd08417e66736d1a8ee4ad6f8ac390aff12af35563bef526f91 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..f2cbe02e4922a4920c0a827f09f6df580967beb0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c5704b322a17ce5b2788c1247543e3ca9edc36d083fd8ecc8ca80d04334c6030 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..5a572db6bb4d69f7faa64c6246337520c49bc7c4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:35faa867d55ab0c124a6e3178636b9ef658fb376d7a9da849015b46c0e68ef82 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..11d1593fed1c3a2ca7755adcb1a3a27a816631c0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/trainer_state.json @@ -0,0 +1,918 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 7.866666666666667, + "eval_steps": 10, + "global_step": 590, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.66778247184384e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-590/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..12e479a6922190bee60e80ad8323a1e5a833e645 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1aca9f54bf932a20e1246eb8bcf204dce195b42171bdb0fac19d61b1e7cf497f +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f37b4340569d3696c08bea1aef9f7b422394b323 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a6ce6358bbf8d49e7e471e67820674c6401888e387a763356aa1819e6de264b7 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..3d041c10a3af80c2be01488b87e7c23a107acab4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:224b98cd2a3813f8f156af229101dde99ced2e24294f3d7ad7b1538fdc49c27c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..e35866f32db88c57fbcc281885df929786abae39 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:db64dfcaaa6d2770fdeb8c6c250f6efda7e6b2cbc236d50bf153703fcb63ac50 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..31d76e2bf96073c6d712b1355288777072ae495a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/trainer_state.json @@ -0,0 +1,123 @@ +{ + "best_metric": 2.443875789642334, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60", + "epoch": 0.8, + "eval_steps": 10, + "global_step": 60, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9831643191705600.0, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-60/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..269a32a0c21484dec3c60cfc1c73c73c4a2bf036 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a0e54b38b7fe42c084bdba77b5c431ff7f20130b4eaa6bd1e29b8b13dbdaf6a7 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f5aeb76945dd4ee544a8395dbc9b654f05a1105d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:25520a7df2a2249fcde4dd089e8f12f1e3686e796f771ceb59a94c6cfda42b76 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..ef40b259bc3233779099c3b8651c2fe0a9d07fa5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9bbc772ea5a37ab482a5fa0d13a2014584215ee3da6246ff6fe50fb8dafbfb8e +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..07828ebdc4bf4ad63c158d34e9e4d66b2b692658 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:83101977bcee09ea8757fcdca77b104a26e82543160a4a957d641f71c49be670 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..67834c40ebf6e849503180ac9a9a0a51ddd5105d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/trainer_state.json @@ -0,0 +1,933 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.0, + "eval_steps": 10, + "global_step": 600, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.8316431917056e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-600/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..caf7257f3cbd5d40b77b65b5ee8c0d7a0c706617 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:99b8451f731a6eba091a1756518cb8832a04217badf0e09bc052325b5d45687c +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d93fec5237fbdaff093c5a9e3326cb0cb745ff21 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f0003a96b90cc580302073fdbfc82d28255492817e75635691935fec84948eec +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..6a970899a5edc16268fdea83560e0495a3d06810 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fa5b53289977451ca52671d3897055616936322daf22f6e4246ff72a467aef1c +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..bc3e541f5b02e78c5f96597be4032a6bbab995b9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:43e3bdd381ebde864670b497956913f5b3f47f46ff70732cd2c4baf55c0c2663 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..74b2f4912f8960691b9cdb9ed74490a1fa70525e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/trainer_state.json @@ -0,0 +1,948 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.133333333333333, + "eval_steps": 10, + "global_step": 610, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 9.99550391156736e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-610/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..c3d5f829861470f4c248b56bb8f1785ab96c84ab --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e49337198fc08a8aa2c30b7201b3ee3a35937dfbff70534a424893362759e776 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..16ce93c7bf884fd9c5c00eae5b575f11cca34382 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe98c060964256cf09d312b60b6f1cfe4781c7673c6bd00359f6f46e8a2fa265 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..da7e5f0f7045f8fad1c1529974e555cc67b8f5f0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f2b2ce429e00eba0165cdfd527b7ca384fed68ae5660561d0cbc6dbdd51ce7f1 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..7805ebb9c2ef03332629fed5505c853ef805acdf --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6da8bbf746b93799ac402b356556be20882200699a05ee0cf87b32494b6ac8ec +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bf233c259529fb1bc2746d98aaf74eb8cc26cd1a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/trainer_state.json @@ -0,0 +1,963 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.266666666666667, + "eval_steps": 10, + "global_step": 620, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.8645944595336914, + "learning_rate": 7.111111111111112e-06, + "loss": 1.514, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9340760707855225, + "eval_runtime": 43.8347, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 620 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.015936463142912e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-620/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..91819b3cc2943603bbfe7ec7c5d865bad9e630e7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ebf5a4171e4d287954f23855444768a49937a65eca4cabdec11a1196be3c4b99 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..e8a0a3a7442e16b0ac178ef7e66bfcaa6980d54b --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:13f9997dd22f68d377a44f21a150ff44935e88469e2b455a9587be1f6c6b34fe +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..96d7a3f6be074e46014211fae837a521e5c5140c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dd6c4f62bed5401eddcf930d960632a48c624bea715ca64cedd7d04db198b4a0 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..0bdcffb0ae6cec4a396a52ae5b4383f83b898d44 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:14ee9bb2b5946aa2a0a572f5638742b652ef249cae6e054056e7f3d66777fd09 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bc221eb0db212b09323cb175f3acf96b47a9b4d6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/trainer_state.json @@ -0,0 +1,978 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.4, + "eval_steps": 10, + "global_step": 630, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.8645944595336914, + "learning_rate": 7.111111111111112e-06, + "loss": 1.514, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9340760707855225, + "eval_runtime": 43.8347, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6941475868225098, + "learning_rate": 5.925925925925926e-06, + "loss": 1.519, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.933932065963745, + "eval_runtime": 43.8419, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 630 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.032322535129088e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-630/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e37143d79c20d90fc39b76e98b2e46b00a5a65d5 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66e37db9551d85600e0abd5abd34393cb31f6d35ee989b1b7aea91b26c873f96 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..5895bce4b440ef06b508b8cfba52e4b28130d47d --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:56459c99a3306d963693ae19fe271b8e91ec5707bd4891d7e02981dde6a0af79 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..bc02fa7e506af341c87e94bd62a6cbdfbd057096 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0597f3b9ac321e002676eb1712670348770197d9b197cdd7a7e16f465315444e +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..eb13ec5343dc414392944cec9b2a0813557bc05e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ec3ab92bb9031512ea9a383265538790ae372d9fc6fdc8cbb317adadbc69ff04 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..bd6bb61fa08659fcb0a96e1625eee08be3e619f0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/trainer_state.json @@ -0,0 +1,993 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.533333333333333, + "eval_steps": 10, + "global_step": 640, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.8645944595336914, + "learning_rate": 7.111111111111112e-06, + "loss": 1.514, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9340760707855225, + "eval_runtime": 43.8347, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6941475868225098, + "learning_rate": 5.925925925925926e-06, + "loss": 1.519, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.933932065963745, + "eval_runtime": 43.8419, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.8782618045806885, + "learning_rate": 4.7407407407407415e-06, + "loss": 1.5587, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9295082092285156, + "eval_runtime": 43.8426, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 640 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.048708607115264e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-640/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..36ce62a0d7c6d72f5b45c5276acd76d902b89305 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ccd75993a47acf10871b789dcd570352ac9346bd93d75d7735b98d28296cc5dc +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..c0bf31b14a187f9a354a730a0703f29b86c2ae61 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:763de3bf194bb407ae0510d5bd89c2a39393f0c01aa911dde8fc2f2c5c6a14da +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4d763156eb3a586b51733d4ec683a815a6ae5fab --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e66e316bd2615a5005aac13970f8b8e71830843ea716191e53ff7dc38997af08 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..538c330f52b750df6b61e74582431fa3c86c56fc --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:305fa9d76c8c633a785aa3f99ea69f568062a8f51e8b9922f48b53086b195e34 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..85735d7cd666568e35f17d2069873fc69b1e4a5f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/trainer_state.json @@ -0,0 +1,1008 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.666666666666666, + "eval_steps": 10, + "global_step": 650, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.8645944595336914, + "learning_rate": 7.111111111111112e-06, + "loss": 1.514, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9340760707855225, + "eval_runtime": 43.8347, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6941475868225098, + "learning_rate": 5.925925925925926e-06, + "loss": 1.519, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.933932065963745, + "eval_runtime": 43.8419, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.8782618045806885, + "learning_rate": 4.7407407407407415e-06, + "loss": 1.5587, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9295082092285156, + "eval_runtime": 43.8426, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.9059834480285645, + "learning_rate": 3.555555555555556e-06, + "loss": 1.618, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.929395914077759, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 650 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.06509467910144e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-650/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..38c6ec7157da400d7a0f2122a7bdcd32a9efce7e --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a616f3950af5ed3c53ecd1b33f1ba3a5857345dd20f39441f76659943007b2e3 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..831ed1d8cfd0d021ca2b75d347a08eeabd4c4043 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5a6881fba313eb884f9eb32bc4e83c5b9378b7c9bdd5b417baa6316043792fa5 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..1bd0e24dcfea6867dcdb66e0b90f3344dbd9d339 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:66fa7ea9452d536e82e5c18c4a0a05615143763aa569d9af13553a06a11128de +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..48f0df4dfa3298865154a751b343810e40adc501 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:6c9a36d3bfc95fbbc7c0f031e956c7e22ccefdaa3618f18cceaf4a5c503033d3 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..f6767aa3600a3b84100e705f09266bcb24652fcb --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/trainer_state.json @@ -0,0 +1,1023 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.8, + "eval_steps": 10, + "global_step": 660, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.8645944595336914, + "learning_rate": 7.111111111111112e-06, + "loss": 1.514, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9340760707855225, + "eval_runtime": 43.8347, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6941475868225098, + "learning_rate": 5.925925925925926e-06, + "loss": 1.519, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.933932065963745, + "eval_runtime": 43.8419, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.8782618045806885, + "learning_rate": 4.7407407407407415e-06, + "loss": 1.5587, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9295082092285156, + "eval_runtime": 43.8426, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.9059834480285645, + "learning_rate": 3.555555555555556e-06, + "loss": 1.618, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.929395914077759, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.9509334564208984, + "learning_rate": 2.3703703703703707e-06, + "loss": 1.6203, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.92995548248291, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 660 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.081480751087616e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-660/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..9c8f31375b879a9e3f8d07c3d06cefc9df7445f8 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ab923d9ba0868870c4519419ac362260fffd9b663a9b784af95f20a5eaf30cf8 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..0a282ae7b5c57d95322fc1f804e6d604dbf5671a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:235b8bbad67a3844bc709fd0f7bec0367694ef5cb817473c52dda09eaa584d27 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..b50ed8357a00070f99a52843c3e3d150dbd5b1aa --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5bb0850ed44e50e4ccb2afc9aab9a80c17a31208454b069930105956f7f9a183 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..8a2dfbd30e870760e3ca9893b579f919a09b6871 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fe177485ddd5ed488e0df9716c937fd2ab33a0c2447be4d5cbf96e011f87d5c1 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..7382a222e1c47e8105b3efeef81622104e117ec9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 8.933333333333334, + "eval_steps": 10, + "global_step": 670, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.8645944595336914, + "learning_rate": 7.111111111111112e-06, + "loss": 1.514, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9340760707855225, + "eval_runtime": 43.8347, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6941475868225098, + "learning_rate": 5.925925925925926e-06, + "loss": 1.519, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.933932065963745, + "eval_runtime": 43.8419, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.8782618045806885, + "learning_rate": 4.7407407407407415e-06, + "loss": 1.5587, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9295082092285156, + "eval_runtime": 43.8426, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.9059834480285645, + "learning_rate": 3.555555555555556e-06, + "loss": 1.618, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.929395914077759, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.9509334564208984, + "learning_rate": 2.3703703703703707e-06, + "loss": 1.6203, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.92995548248291, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 574.3383178710938, + "learning_rate": 1.1851851851851854e-06, + "loss": 1.5739, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.9305057525634766, + "eval_runtime": 43.8299, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.097866823073792e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-670/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..4f8578b58f3cc8d0d6c477df75fcd1096fec7cc6 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ef6689daa96452b8983967e9d4463dc5a1373790e3e998b814496bf2fa9e31d8 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..f679d05c46629fbd6fe84581e651d644224986b7 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d23cff659d212ad594ac0888eea52be468b9bc19105e5c421268e63e53b561a9 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..bb61823d0d78956427b74dd1a3fc741ba1b2381f --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c44717b587bf877ea1a37c7f5747a93e45e34ce231c845a31a9b8a042ee22593 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..72ac93ee4249bf1220c3ed82f099c14ae0267a68 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:886b6be563b163a73eaac3a0ce905ce45ea5202bed173e897fec04ed18434edc +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..00864d75d69b592dbbef2a32915c38670c2792ac --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/trainer_state.json @@ -0,0 +1,1038 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 9.0, + "eval_steps": 10, + "global_step": 675, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + }, + { + "epoch": 1.3333333333333333, + "grad_norm": 0.5702881217002869, + "learning_rate": 6.83851851851852e-05, + "loss": 2.4312, + "step": 100 + }, + { + "epoch": 1.3333333333333333, + "eval_loss": 2.4479310512542725, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 100 + }, + { + "epoch": 1.4666666666666668, + "grad_norm": 0.6380157470703125, + "learning_rate": 6.720000000000001e-05, + "loss": 2.3555, + "step": 110 + }, + { + "epoch": 1.4666666666666668, + "eval_loss": 2.4494454860687256, + "eval_runtime": 43.8344, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 110 + }, + { + "epoch": 1.6, + "grad_norm": 0.653825044631958, + "learning_rate": 6.601481481481482e-05, + "loss": 2.3021, + "step": 120 + }, + { + "epoch": 1.6, + "eval_loss": 2.4513721466064453, + "eval_runtime": 43.8593, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 120 + }, + { + "epoch": 1.7333333333333334, + "grad_norm": 0.7331721186637878, + "learning_rate": 6.482962962962964e-05, + "loss": 2.3933, + "step": 130 + }, + { + "epoch": 1.7333333333333334, + "eval_loss": 2.451523780822754, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 130 + }, + { + "epoch": 1.8666666666666667, + "grad_norm": 0.7955519556999207, + "learning_rate": 6.364444444444445e-05, + "loss": 2.3014, + "step": 140 + }, + { + "epoch": 1.8666666666666667, + "eval_loss": 2.452890634536743, + "eval_runtime": 43.823, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 140 + }, + { + "epoch": 2.0, + "grad_norm": 0.8482790589332581, + "learning_rate": 6.245925925925926e-05, + "loss": 2.3977, + "step": 150 + }, + { + "epoch": 2.0, + "eval_loss": 2.453979015350342, + "eval_runtime": 43.8265, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 150 + }, + { + "epoch": 2.1333333333333333, + "grad_norm": 0.8355972170829773, + "learning_rate": 6.127407407407407e-05, + "loss": 2.1801, + "step": 160 + }, + { + "epoch": 2.1333333333333333, + "eval_loss": 2.4687201976776123, + "eval_runtime": 43.8209, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 160 + }, + { + "epoch": 2.2666666666666666, + "grad_norm": 1.1043310165405273, + "learning_rate": 6.020740740740741e-05, + "loss": 2.2575, + "step": 170 + }, + { + "epoch": 2.2666666666666666, + "eval_loss": 2.4910356998443604, + "eval_runtime": 43.8321, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 170 + }, + { + "epoch": 2.4, + "grad_norm": 1.167297124862671, + "learning_rate": 5.902222222222222e-05, + "loss": 2.2586, + "step": 180 + }, + { + "epoch": 2.4, + "eval_loss": 2.5020434856414795, + "eval_runtime": 43.8479, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 180 + }, + { + "epoch": 2.533333333333333, + "grad_norm": 1.1763638257980347, + "learning_rate": 5.783703703703704e-05, + "loss": 2.1824, + "step": 190 + }, + { + "epoch": 2.533333333333333, + "eval_loss": 2.4987149238586426, + "eval_runtime": 43.8286, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 190 + }, + { + "epoch": 2.6666666666666665, + "grad_norm": 1.1276201009750366, + "learning_rate": 5.665185185185186e-05, + "loss": 2.2198, + "step": 200 + }, + { + "epoch": 2.6666666666666665, + "eval_loss": 2.4994020462036133, + "eval_runtime": 43.8331, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 200 + }, + { + "epoch": 2.8, + "grad_norm": 1.30434250831604, + "learning_rate": 5.5466666666666675e-05, + "loss": 2.2126, + "step": 210 + }, + { + "epoch": 2.8, + "eval_loss": 2.504134178161621, + "eval_runtime": 43.8231, + "eval_samples_per_second": 22.819, + "eval_steps_per_second": 2.852, + "step": 210 + }, + { + "epoch": 2.9333333333333336, + "grad_norm": 1.2805118560791016, + "learning_rate": 5.4281481481481486e-05, + "loss": 2.2297, + "step": 220 + }, + { + "epoch": 2.9333333333333336, + "eval_loss": 2.504213333129883, + "eval_runtime": 43.8403, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 220 + }, + { + "epoch": 3.066666666666667, + "grad_norm": 1.273930311203003, + "learning_rate": 5.30962962962963e-05, + "loss": 2.1546, + "step": 230 + }, + { + "epoch": 3.066666666666667, + "eval_loss": 2.5187768936157227, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 230 + }, + { + "epoch": 3.2, + "grad_norm": 1.6285576820373535, + "learning_rate": 5.1911111111111114e-05, + "loss": 2.0298, + "step": 240 + }, + { + "epoch": 3.2, + "eval_loss": 2.5875661373138428, + "eval_runtime": 43.8356, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.852, + "step": 240 + }, + { + "epoch": 3.3333333333333335, + "grad_norm": 1.6739757061004639, + "learning_rate": 5.072592592592593e-05, + "loss": 2.0294, + "step": 250 + }, + { + "epoch": 3.3333333333333335, + "eval_loss": 2.57621431350708, + "eval_runtime": 43.8461, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 250 + }, + { + "epoch": 3.466666666666667, + "grad_norm": NaN, + "learning_rate": 4.9659259259259264e-05, + "loss": 2.0768, + "step": 260 + }, + { + "epoch": 3.466666666666667, + "eval_loss": 2.5923492908477783, + "eval_runtime": 43.8497, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 260 + }, + { + "epoch": 3.6, + "grad_norm": 1.8317577838897705, + "learning_rate": 4.847407407407408e-05, + "loss": 2.0638, + "step": 270 + }, + { + "epoch": 3.6, + "eval_loss": 2.589301109313965, + "eval_runtime": 43.8545, + "eval_samples_per_second": 22.803, + "eval_steps_per_second": 2.85, + "step": 270 + }, + { + "epoch": 3.7333333333333334, + "grad_norm": 1.8112515211105347, + "learning_rate": 4.72888888888889e-05, + "loss": 1.993, + "step": 280 + }, + { + "epoch": 3.7333333333333334, + "eval_loss": 2.587583065032959, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 280 + }, + { + "epoch": 3.8666666666666667, + "grad_norm": 1.7112412452697754, + "learning_rate": 4.610370370370371e-05, + "loss": 2.0502, + "step": 290 + }, + { + "epoch": 3.8666666666666667, + "eval_loss": 2.5876784324645996, + "eval_runtime": 43.8388, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 290 + }, + { + "epoch": 4.0, + "grad_norm": 1.838281512260437, + "learning_rate": 4.491851851851852e-05, + "loss": 2.0462, + "step": 300 + }, + { + "epoch": 4.0, + "eval_loss": 2.590817928314209, + "eval_runtime": 43.8463, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 300 + }, + { + "epoch": 4.133333333333334, + "grad_norm": 2.24188494682312, + "learning_rate": 4.373333333333334e-05, + "loss": 1.9831, + "step": 310 + }, + { + "epoch": 4.133333333333334, + "eval_loss": 2.6747002601623535, + "eval_runtime": 43.8564, + "eval_samples_per_second": 22.802, + "eval_steps_per_second": 2.85, + "step": 310 + }, + { + "epoch": 4.266666666666667, + "grad_norm": 2.203629493713379, + "learning_rate": 4.254814814814815e-05, + "loss": 1.8985, + "step": 320 + }, + { + "epoch": 4.266666666666667, + "eval_loss": 2.6817915439605713, + "eval_runtime": 43.8451, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 320 + }, + { + "epoch": 4.4, + "grad_norm": 2.337613344192505, + "learning_rate": 4.136296296296297e-05, + "loss": 1.8172, + "step": 330 + }, + { + "epoch": 4.4, + "eval_loss": 2.683194398880005, + "eval_runtime": 43.8437, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 330 + }, + { + "epoch": 4.533333333333333, + "grad_norm": 2.1497907638549805, + "learning_rate": 4.017777777777778e-05, + "loss": 1.8996, + "step": 340 + }, + { + "epoch": 4.533333333333333, + "eval_loss": 2.682828187942505, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 340 + }, + { + "epoch": 4.666666666666667, + "grad_norm": 2.2859954833984375, + "learning_rate": 3.899259259259259e-05, + "loss": 1.832, + "step": 350 + }, + { + "epoch": 4.666666666666667, + "eval_loss": 2.670285701751709, + "eval_runtime": 43.8444, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 350 + }, + { + "epoch": 4.8, + "grad_norm": 2.2450287342071533, + "learning_rate": 3.780740740740741e-05, + "loss": 1.9352, + "step": 360 + }, + { + "epoch": 4.8, + "eval_loss": 2.6795761585235596, + "eval_runtime": 43.8383, + "eval_samples_per_second": 22.811, + "eval_steps_per_second": 2.851, + "step": 360 + }, + { + "epoch": 4.933333333333334, + "grad_norm": 2.2327992916107178, + "learning_rate": 3.662222222222223e-05, + "loss": 1.8781, + "step": 370 + }, + { + "epoch": 4.933333333333334, + "eval_loss": 2.6721572875976562, + "eval_runtime": 43.8459, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 370 + }, + { + "epoch": 5.066666666666666, + "grad_norm": 2.1421315670013428, + "learning_rate": 3.543703703703704e-05, + "loss": 1.8686, + "step": 380 + }, + { + "epoch": 5.066666666666666, + "eval_loss": 2.707505464553833, + "eval_runtime": 43.8594, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 380 + }, + { + "epoch": 5.2, + "grad_norm": 2.9969141483306885, + "learning_rate": 3.4251851851851856e-05, + "loss": 1.7671, + "step": 390 + }, + { + "epoch": 5.2, + "eval_loss": 2.792614459991455, + "eval_runtime": 43.8503, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 390 + }, + { + "epoch": 5.333333333333333, + "grad_norm": 2.7963175773620605, + "learning_rate": 3.3066666666666666e-05, + "loss": 1.779, + "step": 400 + }, + { + "epoch": 5.333333333333333, + "eval_loss": 2.7597317695617676, + "eval_runtime": 43.8303, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 400 + }, + { + "epoch": 5.466666666666667, + "grad_norm": 2.831728935241699, + "learning_rate": 3.1881481481481484e-05, + "loss": 1.7724, + "step": 410 + }, + { + "epoch": 5.466666666666667, + "eval_loss": 2.7580933570861816, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 410 + }, + { + "epoch": 5.6, + "grad_norm": 2.6877756118774414, + "learning_rate": 3.06962962962963e-05, + "loss": 1.7872, + "step": 420 + }, + { + "epoch": 5.6, + "eval_loss": 2.7636563777923584, + "eval_runtime": 43.843, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 420 + }, + { + "epoch": 5.733333333333333, + "grad_norm": 2.7070910930633545, + "learning_rate": 2.951111111111111e-05, + "loss": 1.7438, + "step": 430 + }, + { + "epoch": 5.733333333333333, + "eval_loss": 2.763782262802124, + "eval_runtime": 43.8312, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 430 + }, + { + "epoch": 5.866666666666667, + "grad_norm": 3.0255019664764404, + "learning_rate": 2.8444444444444447e-05, + "loss": 1.764, + "step": 440 + }, + { + "epoch": 5.866666666666667, + "eval_loss": 2.756753444671631, + "eval_runtime": 43.8473, + "eval_samples_per_second": 22.806, + "eval_steps_per_second": 2.851, + "step": 440 + }, + { + "epoch": 6.0, + "grad_norm": 2.8314449787139893, + "learning_rate": 2.725925925925926e-05, + "loss": 1.8499, + "step": 450 + }, + { + "epoch": 6.0, + "eval_loss": 2.7551913261413574, + "eval_runtime": 43.828, + "eval_samples_per_second": 22.816, + "eval_steps_per_second": 2.852, + "step": 450 + }, + { + "epoch": 6.133333333333334, + "grad_norm": 2.9919869899749756, + "learning_rate": 2.607407407407408e-05, + "loss": 1.6757, + "step": 460 + }, + { + "epoch": 6.133333333333334, + "eval_loss": 2.829700231552124, + "eval_runtime": 43.8413, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 460 + }, + { + "epoch": 6.266666666666667, + "grad_norm": 3.257183074951172, + "learning_rate": 2.4888888888888893e-05, + "loss": 1.6556, + "step": 470 + }, + { + "epoch": 6.266666666666667, + "eval_loss": 2.8559584617614746, + "eval_runtime": 43.8592, + "eval_samples_per_second": 22.8, + "eval_steps_per_second": 2.85, + "step": 470 + }, + { + "epoch": 6.4, + "grad_norm": 3.2374985218048096, + "learning_rate": 2.3703703703703703e-05, + "loss": 1.6625, + "step": 480 + }, + { + "epoch": 6.4, + "eval_loss": 2.830960512161255, + "eval_runtime": 43.8325, + "eval_samples_per_second": 22.814, + "eval_steps_per_second": 2.852, + "step": 480 + }, + { + "epoch": 6.533333333333333, + "grad_norm": 3.171888828277588, + "learning_rate": 2.251851851851852e-05, + "loss": 1.638, + "step": 490 + }, + { + "epoch": 6.533333333333333, + "eval_loss": 2.8306448459625244, + "eval_runtime": 43.8501, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 490 + }, + { + "epoch": 6.666666666666667, + "grad_norm": 3.096437692642212, + "learning_rate": 2.1333333333333335e-05, + "loss": 1.7256, + "step": 500 + }, + { + "epoch": 6.666666666666667, + "eval_loss": 2.8334598541259766, + "eval_runtime": 43.8702, + "eval_samples_per_second": 22.795, + "eval_steps_per_second": 2.849, + "step": 500 + }, + { + "epoch": 6.8, + "grad_norm": 3.2644155025482178, + "learning_rate": 2.014814814814815e-05, + "loss": 1.7346, + "step": 510 + }, + { + "epoch": 6.8, + "eval_loss": 2.8344273567199707, + "eval_runtime": 43.9437, + "eval_samples_per_second": 22.756, + "eval_steps_per_second": 2.845, + "step": 510 + }, + { + "epoch": 6.933333333333334, + "grad_norm": 3.287597179412842, + "learning_rate": 1.8962962962962966e-05, + "loss": 1.7282, + "step": 520 + }, + { + "epoch": 6.933333333333334, + "eval_loss": 2.839320659637451, + "eval_runtime": 44.0144, + "eval_samples_per_second": 22.72, + "eval_steps_per_second": 2.84, + "step": 520 + }, + { + "epoch": 7.066666666666666, + "grad_norm": 3.069183349609375, + "learning_rate": 1.7777777777777777e-05, + "loss": 1.563, + "step": 530 + }, + { + "epoch": 7.066666666666666, + "eval_loss": 2.8437397480010986, + "eval_runtime": 43.9934, + "eval_samples_per_second": 22.731, + "eval_steps_per_second": 2.841, + "step": 530 + }, + { + "epoch": 7.2, + "grad_norm": 3.2864837646484375, + "learning_rate": 1.6592592592592594e-05, + "loss": 1.5805, + "step": 540 + }, + { + "epoch": 7.2, + "eval_loss": 2.8996827602386475, + "eval_runtime": 43.8428, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 540 + }, + { + "epoch": 7.333333333333333, + "grad_norm": 3.441690444946289, + "learning_rate": 1.5407407407407408e-05, + "loss": 1.5971, + "step": 550 + }, + { + "epoch": 7.333333333333333, + "eval_loss": 2.9018611907958984, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 550 + }, + { + "epoch": 7.466666666666667, + "grad_norm": 3.587251901626587, + "learning_rate": 1.4222222222222224e-05, + "loss": 1.6533, + "step": 560 + }, + { + "epoch": 7.466666666666667, + "eval_loss": 2.895568370819092, + "eval_runtime": 43.8305, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 560 + }, + { + "epoch": 7.6, + "grad_norm": 3.345590591430664, + "learning_rate": 1.303703703703704e-05, + "loss": 1.5974, + "step": 570 + }, + { + "epoch": 7.6, + "eval_loss": 2.890951156616211, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 570 + }, + { + "epoch": 7.733333333333333, + "grad_norm": 3.685967445373535, + "learning_rate": 1.1851851851851852e-05, + "loss": 1.5801, + "step": 580 + }, + { + "epoch": 7.733333333333333, + "eval_loss": 2.891247510910034, + "eval_runtime": 43.8422, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 580 + }, + { + "epoch": 7.866666666666667, + "grad_norm": 3.663407564163208, + "learning_rate": 1.0666666666666667e-05, + "loss": 1.653, + "step": 590 + }, + { + "epoch": 7.866666666666667, + "eval_loss": 2.8887994289398193, + "eval_runtime": 43.8396, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 590 + }, + { + "epoch": 8.0, + "grad_norm": 6.289735317230225, + "learning_rate": 9.481481481481483e-06, + "loss": 1.6507, + "step": 600 + }, + { + "epoch": 8.0, + "eval_loss": 2.8898768424987793, + "eval_runtime": 43.8414, + "eval_samples_per_second": 22.81, + "eval_steps_per_second": 2.851, + "step": 600 + }, + { + "epoch": 8.133333333333333, + "grad_norm": 3.618720293045044, + "learning_rate": 8.296296296296297e-06, + "loss": 1.514, + "step": 610 + }, + { + "epoch": 8.133333333333333, + "eval_loss": 2.9118874073028564, + "eval_runtime": 43.8447, + "eval_samples_per_second": 22.808, + "eval_steps_per_second": 2.851, + "step": 610 + }, + { + "epoch": 8.266666666666667, + "grad_norm": 3.8645944595336914, + "learning_rate": 7.111111111111112e-06, + "loss": 1.514, + "step": 620 + }, + { + "epoch": 8.266666666666667, + "eval_loss": 2.9340760707855225, + "eval_runtime": 43.8347, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 620 + }, + { + "epoch": 8.4, + "grad_norm": 3.6941475868225098, + "learning_rate": 5.925925925925926e-06, + "loss": 1.519, + "step": 630 + }, + { + "epoch": 8.4, + "eval_loss": 2.933932065963745, + "eval_runtime": 43.8419, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 630 + }, + { + "epoch": 8.533333333333333, + "grad_norm": 3.8782618045806885, + "learning_rate": 4.7407407407407415e-06, + "loss": 1.5587, + "step": 640 + }, + { + "epoch": 8.533333333333333, + "eval_loss": 2.9295082092285156, + "eval_runtime": 43.8426, + "eval_samples_per_second": 22.809, + "eval_steps_per_second": 2.851, + "step": 640 + }, + { + "epoch": 8.666666666666666, + "grad_norm": 3.9059834480285645, + "learning_rate": 3.555555555555556e-06, + "loss": 1.618, + "step": 650 + }, + { + "epoch": 8.666666666666666, + "eval_loss": 2.929395914077759, + "eval_runtime": 43.8368, + "eval_samples_per_second": 22.812, + "eval_steps_per_second": 2.851, + "step": 650 + }, + { + "epoch": 8.8, + "grad_norm": 3.9509334564208984, + "learning_rate": 2.3703703703703707e-06, + "loss": 1.6203, + "step": 660 + }, + { + "epoch": 8.8, + "eval_loss": 2.92995548248291, + "eval_runtime": 43.8496, + "eval_samples_per_second": 22.805, + "eval_steps_per_second": 2.851, + "step": 660 + }, + { + "epoch": 8.933333333333334, + "grad_norm": 574.3383178710938, + "learning_rate": 1.1851851851851854e-06, + "loss": 1.5739, + "step": 670 + }, + { + "epoch": 8.933333333333334, + "eval_loss": 2.9305057525634766, + "eval_runtime": 43.8299, + "eval_samples_per_second": 22.815, + "eval_steps_per_second": 2.852, + "step": 670 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": true + }, + "attributes": {} + } + }, + "total_flos": 1.10605985906688e+17, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-675/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..05781a7d5a29d7ee0b9c8eb190898e9420b596ed --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9750fafbe79122b0cdaf4bb950134905c114340b23f4bf3ee1a8b99433050ebb +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..d753043ea84bfd7a7f6449047686d144e792e9af --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1926858f02e4d23399312b6aafc028b8fb5abd9a4d67620d11188eb4bc891f45 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..2b1c959e3b92a9d3847cd61e595c79a1813cfe3a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0bf8faccd3d2ca94b80304c3092e394e13d076f35c0c4f51d74490ac3412d5f9 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..4e0b845352c058c10456242d7048575bb3ed9ed9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ea49100ba0a4f3150de9cb995c7912874f39ba5fc6da892eafe82565fe347b82 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..d8c3d441475ee91d65e12b6905854dc378dcb1ff --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/trainer_state.json @@ -0,0 +1,138 @@ +{ + "best_metric": 2.4428653717041016, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70", + "epoch": 0.9333333333333333, + "eval_steps": 10, + "global_step": 70, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.14702503903232e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-70/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..46e057279b82880a6790abbd45f01cfc7994f52c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:9390098b99a81f26fa38bbe981db80eba9ff50df23f7ff415e967e317c8606ee +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..9219d97285c8330795ea339fbf92f2a7dc64aaee --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ecd885034642a2cee1f9bbbd72275a1ca461a31f6ba0bdcd897a2d018edaf979 +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..0b228b8e8106f666fe286c5d131d496d926a7df4 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:debbe8bbbf3d0dfd719072ab48974c332b6f78ebe25ef99f5002c8d0a8c8c380 +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..1248c444261af6164ab0c09bf0d8ff5b4162cf1c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ca68de17dd1fdc64e6f79214024f511b4e51a08963f5ee15b680160138e4c50d +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..47928e71ae0ddea0ddda77fe9c07911bd1336070 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/trainer_state.json @@ -0,0 +1,153 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.0666666666666667, + "eval_steps": 10, + "global_step": 80, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.31088575889408e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/README.md b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e255b1d99c1c1d12955d852dc1056813be7ffca0 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/README.md @@ -0,0 +1,202 @@ +--- +base_model: /workspace/pythia-6_9b +library_name: peft +--- + +# Model Card for Model ID + + + + + +## Model Details + +### Model Description + + + + + +- **Developed by:** [More Information Needed] +- **Funded by [optional]:** [More Information Needed] +- **Shared by [optional]:** [More Information Needed] +- **Model type:** [More Information Needed] +- **Language(s) (NLP):** [More Information Needed] +- **License:** [More Information Needed] +- **Finetuned from model [optional]:** [More Information Needed] + +### Model Sources [optional] + + + +- **Repository:** [More Information Needed] +- **Paper [optional]:** [More Information Needed] +- **Demo [optional]:** [More Information Needed] + +## Uses + + + +### Direct Use + + + +[More Information Needed] + +### Downstream Use [optional] + + + +[More Information Needed] + +### Out-of-Scope Use + + + +[More Information Needed] + +## Bias, Risks, and Limitations + + + +[More Information Needed] + +### Recommendations + + + +Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. + +## How to Get Started with the Model + +Use the code below to get started with the model. + +[More Information Needed] + +## Training Details + +### Training Data + + + +[More Information Needed] + +### Training Procedure + + + +#### Preprocessing [optional] + +[More Information Needed] + + +#### Training Hyperparameters + +- **Training regime:** [More Information Needed] + +#### Speeds, Sizes, Times [optional] + + + +[More Information Needed] + +## Evaluation + + + +### Testing Data, Factors & Metrics + +#### Testing Data + + + +[More Information Needed] + +#### Factors + + + +[More Information Needed] + +#### Metrics + + + +[More Information Needed] + +### Results + +[More Information Needed] + +#### Summary + + + +## Model Examination [optional] + + + +[More Information Needed] + +## Environmental Impact + + + +Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). + +- **Hardware Type:** [More Information Needed] +- **Hours used:** [More Information Needed] +- **Cloud Provider:** [More Information Needed] +- **Compute Region:** [More Information Needed] +- **Carbon Emitted:** [More Information Needed] + +## Technical Specifications [optional] + +### Model Architecture and Objective + +[More Information Needed] + +### Compute Infrastructure + +[More Information Needed] + +#### Hardware + +[More Information Needed] + +#### Software + +[More Information Needed] + +## Citation [optional] + + + +**BibTeX:** + +[More Information Needed] + +**APA:** + +[More Information Needed] + +## Glossary [optional] + + + +[More Information Needed] + +## More Information [optional] + +[More Information Needed] + +## Model Card Authors [optional] + +[More Information Needed] + +## Model Card Contact + +[More Information Needed] +### Framework versions + +- PEFT 0.13.2 \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_config.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_config.json new file mode 100644 index 0000000000000000000000000000000000000000..2dac7d45378cb5fa31de4db4886fb4f63ba5fcc9 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_config.json @@ -0,0 +1,31 @@ +{ + "alpha_pattern": {}, + "auto_mapping": null, + "base_model_name_or_path": "/workspace/pythia-6_9b", + "bias": "none", + "fan_in_fan_out": false, + "inference_mode": true, + "init_lora_weights": true, + "layer_replication": null, + "layers_pattern": null, + "layers_to_transform": null, + "loftq_config": {}, + "lora_alpha": 32, + "lora_dropout": 0.1, + "megatron_config": null, + "megatron_core": "megatron.core", + "modules_to_save": null, + "peft_type": "LORA", + "r": 8, + "rank_pattern": {}, + "revision": null, + "target_modules": [ + "dense", + "dense_4h_to_h", + "query_key_value", + "dense_h_to_4h" + ], + "task_type": "CAUSAL_LM", + "use_dora": false, + "use_rslora": false +} \ No newline at end of file diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_model.safetensors b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_model.safetensors new file mode 100644 index 0000000000000000000000000000000000000000..e6e782323d4c29098ef8118e04a4525ddf09bd0a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/adapter_model.safetensors @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5bdb4049fc23837af25a2d8f1a945bc362cb3f899363be3dfae521737cdc76e1 +size 67144544 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/optimizer.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/optimizer.pt new file mode 100644 index 0000000000000000000000000000000000000000..8a13432be79265cc5a375a3a366c1e9cc3241485 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/optimizer.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:80e57a139aeda31e50ee168fc1553b7e8419b3fa63807edc1f5e9f410c7cdbba +size 134432453 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/rng_state.pth b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/rng_state.pth new file mode 100644 index 0000000000000000000000000000000000000000..4041231f7cc289aaec627b941b3ce1ed104a3678 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/rng_state.pth @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e1884689751e2c9aa53b83d7472089621e5727e27a037b479e2287c7b208b1a +size 14575 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/scheduler.pt b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/scheduler.pt new file mode 100644 index 0000000000000000000000000000000000000000..ce3367c0b1791ffeb155571b66bce7d1d1d0e23a --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/scheduler.pt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:29d78f449c24bdc84f5cc0eb51e18c7581a6b43cafecb8258a69304eaabc9220 +size 627 diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/trainer_state.json b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/trainer_state.json new file mode 100644 index 0000000000000000000000000000000000000000..879eff70be1fe93355cc10de4c50a4e88e268f14 --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/trainer_state.json @@ -0,0 +1,168 @@ +{ + "best_metric": 2.44228458404541, + "best_model_checkpoint": "./output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-80", + "epoch": 1.2, + "eval_steps": 10, + "global_step": 90, + "is_hyper_param_search": false, + "is_local_process_zero": true, + "is_world_process_zero": true, + "log_history": [ + { + "epoch": 0.13333333333333333, + "grad_norm": 0.39765289425849915, + "learning_rate": 7.881481481481482e-05, + "loss": 2.4708, + "step": 10 + }, + { + "epoch": 0.13333333333333333, + "eval_loss": 2.4511938095092773, + "eval_runtime": 43.7873, + "eval_samples_per_second": 22.838, + "eval_steps_per_second": 2.855, + "step": 10 + }, + { + "epoch": 0.26666666666666666, + "grad_norm": 0.4107244908809662, + "learning_rate": 7.762962962962963e-05, + "loss": 2.3537, + "step": 20 + }, + { + "epoch": 0.26666666666666666, + "eval_loss": 2.449139356613159, + "eval_runtime": 43.8193, + "eval_samples_per_second": 22.821, + "eval_steps_per_second": 2.853, + "step": 20 + }, + { + "epoch": 0.4, + "grad_norm": 0.4165736436843872, + "learning_rate": 7.644444444444445e-05, + "loss": 2.4101, + "step": 30 + }, + { + "epoch": 0.4, + "eval_loss": 2.447827100753784, + "eval_runtime": 43.8128, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 30 + }, + { + "epoch": 0.5333333333333333, + "grad_norm": 0.39944154024124146, + "learning_rate": 7.525925925925926e-05, + "loss": 2.4853, + "step": 40 + }, + { + "epoch": 0.5333333333333333, + "eval_loss": 2.446575880050659, + "eval_runtime": 43.8141, + "eval_samples_per_second": 22.824, + "eval_steps_per_second": 2.853, + "step": 40 + }, + { + "epoch": 0.6666666666666666, + "grad_norm": 0.3800008296966553, + "learning_rate": 7.407407407407409e-05, + "loss": 2.4732, + "step": 50 + }, + { + "epoch": 0.6666666666666666, + "eval_loss": 2.4452807903289795, + "eval_runtime": 43.8148, + "eval_samples_per_second": 22.823, + "eval_steps_per_second": 2.853, + "step": 50 + }, + { + "epoch": 0.8, + "grad_norm": 0.3538805842399597, + "learning_rate": 7.28888888888889e-05, + "loss": 2.4204, + "step": 60 + }, + { + "epoch": 0.8, + "eval_loss": 2.443875789642334, + "eval_runtime": 43.8272, + "eval_samples_per_second": 22.817, + "eval_steps_per_second": 2.852, + "step": 60 + }, + { + "epoch": 0.9333333333333333, + "grad_norm": 0.38094404339790344, + "learning_rate": 7.182222222222222e-05, + "loss": 2.3268, + "step": 70 + }, + { + "epoch": 0.9333333333333333, + "eval_loss": 2.4428653717041016, + "eval_runtime": 43.8454, + "eval_samples_per_second": 22.807, + "eval_steps_per_second": 2.851, + "step": 70 + }, + { + "epoch": 1.0666666666666667, + "grad_norm": 0.3511368930339813, + "learning_rate": 7.063703703703705e-05, + "loss": 2.3737, + "step": 80 + }, + { + "epoch": 1.0666666666666667, + "eval_loss": 2.44228458404541, + "eval_runtime": 43.8354, + "eval_samples_per_second": 22.813, + "eval_steps_per_second": 2.852, + "step": 80 + }, + { + "epoch": 1.2, + "grad_norm": 0.4774518609046936, + "learning_rate": 6.957037037037037e-05, + "loss": 2.3664, + "step": 90 + }, + { + "epoch": 1.2, + "eval_loss": 2.4455678462982178, + "eval_runtime": 43.821, + "eval_samples_per_second": 22.82, + "eval_steps_per_second": 2.853, + "step": 90 + } + ], + "logging_steps": 10, + "max_steps": 675, + "num_input_tokens_seen": 0, + "num_train_epochs": 9, + "save_steps": 10, + "stateful_callbacks": { + "TrainerControl": { + "args": { + "should_epoch_stop": false, + "should_evaluate": false, + "should_log": false, + "should_save": true, + "should_training_stop": false + }, + "attributes": {} + } + }, + "total_flos": 1.47474647875584e+16, + "train_batch_size": 8, + "trial_name": null, + "trial_params": null +} diff --git a/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/training_args.bin b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/training_args.bin new file mode 100644 index 0000000000000000000000000000000000000000..7099858da799ed994085eda67be8cabd7f3a2d6c --- /dev/null +++ b/output_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-nonmember-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05/checkpoint-90/training_args.bin @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c0d26eef4a8b33e61ac141a94418f216970ac32db9dde1828f4e79d8f070359d +size 4859 diff --git a/responses_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05-ft.jsonl b/responses_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05-ft.jsonl new file mode 100644 index 0000000000000000000000000000000000000000..de6d25981d3315a8c15251083920044d31573293 --- /dev/null +++ b/responses_ft_more_layers_openwebtext2_epoch_9_mlp/pythia-6_9b-member-6_9b-epoch-9-pile-full-600-subsets-openwebtext2-8e-05-ft.jsonl @@ -0,0 +1,1000 @@ +{"output_text": ".\n\nSo, this is the period of the \u201clove\u201d movement.\n\nAnd it was a movement that was very much in the \u201clove\u201d of the women, and the men were just along for the ride.\n\nAnd it was a movement that was very much in the \u201clove\u201d of the women, and the men were just along for the ride.\n\nAnd it was a movement that was very much in the \u201clove\u201d of the women, and the men were", "input_text": " so bad, and sex was lure, the hook, so they damned sex as a means to keep men from getting ensnared in the traps and wickedness that women lay for men. And the thought has a little bit of merit, I must say.\n\nSo, think about this. The men in power at the time, saw some of the stuff we see, and they gave a huge \u201cthumbs down\u201d on women. Huge.\n\nNow, heading into the second 500 years of Christianity, throw a \u201crubbing elbows\u201d with Moslems in Spain, and this idea of \u201clove\u201d starts to percolate about, sort of this \u201ccounter-culture\u201d idea of the time. It did not exist at all before in European culture, this idea of \u201csoul mates\u201d and \u201cintertwined\u201d spirits and \u201cthe ennobling qualities of love\u201d, love as the be all and end all, the very reason to live.\n\nAnd it was made up.\n\nBy women. Duh?\n\nSo there were moments, during this period 1170-1250 were in certain places the women got control. It the case of this Marie, she got control of this region \u201cTroyes\u201d in southern France when her son was named to be noble over the region and he was 11 years old. So she accompanied him down there and was the defacto \u201cregent\u201d during his \u201cminority\u201d. Her husband became King while she was down there. So this was a woman of major influence. And her sister was married to someone that also became King of someplace else. Their mother had been both Queen of France and then Queen of England after she divorced the King of France. This was a powerful woman who got what she wanted. And two of the chief architects of \u201clove\u201d were her two daughters, who married extremely high status men"} +{"output_text": "service-friendly level.\n\nThe alt-right will likely continue to organize in the open, but it will be more difficult to track. The far-right groups that have been most visible in the past, such as the Proud Boys, will likely continue to organize, but they will be less visible. The Proud Boys, for example, have been known to hold \"Proud Boys Against Communism\" rallies, but they have not been as visible as the alt-right groups.", "input_text": "a and anarchist groups, showed up in force in Charlottesville. They organized strategically, fanning out along the cordoned areas. Some of the most hard-line counter-protesters harassed police and journalists, resulting in at least a few arrests. Some also showed up with protective gear, medical kits, and makeshift weapons, ready for confrontation.\n\nCounter-protesters were similarly outfitted and even more organized in D.C., where two dozen neo-Nazis and white supremacists, including Jason Kessler, one of the main organizers of the Charlottesville rally in 2017, showed up under heavy police protection (much to many people's anger). The final tally of counter-protesters isn't clear, but some estimates put the number around 5,000. Organizing websites indicate that approximately 40-plus groups showed their support and pledged to come; as with the Charlottesville rally, some of the organizers suggested that if you can't join the counter-protests in D.C., then you should \"occupy an ICE office, shut down a police station, or demonstrate at the home of a local racist, ICE agent, or prison profiteer\"\u2014a message that underscores how people beyond just neo-Nazis have been swept up in some far-left activities.\n\nWhat This Means for the Future\n\nWith the massive failure of the recent D.C. rally to bring together a strong group of alt-right members, the online activities of far-right and far-left groups will likely shift. The alt-right seems to be going further underground, to more secure, invite-only chat rooms via encrypted apps. This will make it harder to monitor by advocacy groups like the Southern Poverty Law Center. Its rhetoric will likely continue, albeit at a more terms-of-"} +{"output_text": " the planet. The pod is a prison. Neo is in a pod. He is in a prison. He is in a pod. He is in a prison. He is in a pod. He is in a prison. He is in a pod. He is in a prison. He is in a pod. He is in a prison. He is in a pod. He is in a prison. He is in a pod. He is in a prison. He is in a pod. He", "input_text": " your mind.\n\nMorpheus offers Neo a choice between two pills. A red pill and a blue pill. The blue pill would ensure Neo goes back home and to his bed and continues on with his life. The red pill would mean Neo will stay on to find out what the Matrix really is. Neo chooses the red pill.\n\nThe Matrix Explained: The Red Pill\n\nIt\u2019s actually part of a trace program. Though Morpheus and team know where Neo is inside the Matrix, they don\u2019t know where his physical body is located. The pill disrupts his carrier signals so they can pinpoint his physical location. Neo is seated next to a mirror with a crack. The crack heals itself. Neo touches the mirror and his hand dips in like it were liquid. Morpheus says \u201cHave you ever had a dream, Neo, that you were so sure was real? What if you were unable to wake from that dream? How would you know the difference between the dream world and the real world?\u201d. The liquid spreads on to Neo\u2019s hands, then face and enters him. You hear a digitized scream from Neo. It blacks out. This is what happens. A mirror is something that is a reflection of reality. While it is symbolically used in this scene, Neo\u2019s experiencing something that his mind is not able to explain. This mirror is merely a program placed there by Morpheus and team. Neo feels the experience with the mirror can\u2019t be real. His mind is going into a state of panic. As a result, his physical body is reacting accordingly. It produces a fibrillation. The team is looking for this. The panic caused by the mirror is enough to wake Neo up into the real world. Neo sits in a pod with tubes connected to him. This is the reality of every human (except people of Zion) on"} +{"output_text": " 5.[16] III Aethelred 6.[17] III Aethelred 7.[18] III Aethelred 8.[19] III Aethelred 9.[20] III Aethelred 10.[21] III Aethelred 11.[22] III Aethelred 12.[23] III Aethelred 13.[24] III Aethelred 14.[25] III Aethelred 15.[26] III Aethelred 16", "input_text": ". The Battle of Maldon shows how seriously this responsibility was taken. Eadric resolves to serve his lord in battle andWhen Byrhtnoth is killed, his men have a duty to avenge his death. Traditionally this means that they have to kill the whole of the Viking army to ensure that they kill the actual soldier by whose hands Byrhtnoth has been slain:In this stratified society, every man had a duty to protect the men in his care, and to serve the man who protected him. At the top level, the king of course served no-one, and had only the responsibility of protecting his people. At the lowest levels of society the obligation would be only service. The aristocracy had two duties: they served the king (and their lord if they had one) and in return they were rewarded. They had their own men who served them as lord, and in return they offered their man protection, especially under the law.[1] Gesiths and Thegns in Anglo-Saxon England from the Seventh to the Tenth century - HR Loyn[2] IV Edgar 1.8.[3] IV Edgar 2.a.[4] EHD (English Historical Documents) Vol 1 52 - A Compilation on Status (1002-1023)[5]III Aethelred 11[6] Because of the Scandinavian influence in the north, the ealdormen in Northumbria were termed 'Eorl'[7] EHD 117 page 525[8] EHD 50 page 419[9] EHD 52 - A Compilation on Status[10] EHD 102 page 501[11] II Cnut 20.1[12] II&III Edgar 1.1.[13] II Cnut 31.1.[14] II Cnut 45.3.[15] III Aethelred 4. &"} +{"output_text": " in Florida,\u201d says Repticon\u2019s owner, John Hancock. \u201cWe\u2019re just trying to keep up with it.\u201d\n\nHancock, who has been in the reptile business for more than 30 years, says that the tegu is the most difficult species to control. \u201cIt\u2019s a very hardy animal,\u201d he says. \u201cIt\u2019s a very hardy animal that can survive in a lot of different climates.\u201d\n\nHancock says that the te", "input_text": " Florida\u201d and that the agency is \u201cputting resources towards assessing the risk of species and making any further changes if warranted.\u201d\n\nDevin Belliston carries Rock, a pet Argentine black and white tegu, on his shoulders. Now 4 to 5 years old and full-grown, Rock could live for another 10 or more years. Photo: Karine Aigner\n\nI posed the same question to Phil Goss, president of the U.S. Association of Reptile Keepers, a trade organization that advocates on behalf of reptile owners and breeders. \u201cFWC has been talking about doing something for four or five years and nothing has come to fruition,\u201d he says. \u201cThere\u2019s so much red tape.\u201d But given what we know about tegus\u2019 ability to survive cooler temperatures and the fact that there are several established populations, should a pet store in Florida be able to sell one to anyone who walks in? Goss pauses, then laughs: \u201cIt\u2019s really hard to answer that question,\u201d he says. \u201cYou hate to see responsible people punished because of the actions of irresponsible people.\u201d\n\nFor Mazzotti, there is no question that the tegu trade should be more tightly regulated, and the fact that it\u2019s not is indicative of a much bigger problem. When it comes to exotic species, whether it\u2019s a fish a snake or a plant, the United States has a wait-and-see mentality in which the species first needs to become established and damage the environment before anything is done. \u201cOnce you can see the impact that a species is having,\u201d he says, \u201cit\u2019s too late.\u201d\n\nM\n\nany of the people at Repticon agree with Mazzotti's assessment that it\u2019s often too little too late when it comes to invasive species. \u201cThere are so many different invasive species here"} +{"output_text": " Dining Packages.\n\nGarden Rocks Concert Series\n\nThe Garden Rocks Concert Series is a year-round concert series at Epcot. The concerts are held in the America Gardens Theatre, the Festival Plaza Stage and the Festival Plaza Stage.\n\nThe Garden Rocks Concert Series is presented by the Epcot International Festival of the Arts.\n\nThe Garden Rocks Concert Series is presented by the Epcot International Festival of the Arts.\n\nThe Garden R", "input_text": " 7-9: Berlin\n\nApril 10-11: A Flock of Seagulls\n\nApril 12-13: TBA\n\nApril 14-16: Tony Orlando\n\nApril 17-18: High Valley\n\nApril 19-20: Rick Springfield\n\nApril 21-23: Christopher Cross\n\nApril 24-27: Plain White T\u2019s\n\nApril 28-30: Don Felder, formerly of The Eagles\n\nMay 1-2: Don McLean\n\nMay 3-4: Casting Crowns\n\nMay 5-7: Ambrosia with Peter Beckett\n\nMay 8-9: Thelma Houston\n\nMay 10-11: The Allman Betts Band\n\nMay 12-14: Air Supply\n\nMay 15-18: Herman\u2019s Hermits starring Peter Noone\n\nMay 19-21: Mike DelGuidice & Big Shot\n\nMay 22-23: Claudia Leitte\n\nMay 24-25: Daughtry\n\nMay 26-28: Marshall Tucker Band\n\nMay 29-30: Crowder\n\nMay 31-June 1: Survivor\n\nGarden Rocks Dining Package\n\nThe Garden Rocks Dining Package, available for breakfast, lunch or dinner on select days, is offered during the Garden Rocks Concert Series. Dining packages are available at select Epcot restaurants. Cost will vary depending on restaurant and mealtime selected. Dining packages include an appetizer, entr\u00e9e and dessert\u2014or one full buffet or family-style dining, where applicable\u2014along with a non-alcoholic beverage and a guaranteed seat at one of 3 Garden Rocks concert performances later that day. Dining reservations must be at least 2.5 hours prior to showtime.\n\nAnnual Passholders get 10% off Garden Rocks"} +{"output_text": " that part.\n\nI\u2019m not saying that you should be a total asshole about it. I\u2019m just saying that you should be honest about it.\n\nI think that\u2019s a good point. I think that\u2019s a good point.\n\nI\u2019m not saying that you should be a total asshole about it. I\u2019m just saying that you should be honest about it.\n\nI think that\u2019s a good point. I think that\u2019s a", "input_text": " your own work. You know you\u2019re going to irrationally love it and think it\u2019s smarter and better than anything else, so you have to almost counter that with a weird, irrational hatred/distrust. You have to look at everything with intense hater goggles. Only after you look it with the intense hater goggles and you\u2019re like, \u201cWell, I guess that\u2019s okay.\u201d can you safely say, \u201cOh, I think this is good.\u201d I\u2019m good at doing that, because I can usually just hater goggle it. That\u2019s something I just invented right now\u2014hater goggling. It\u2019s not like I\u2019ve been saying that for years. I\u2019m OK with being like, \u201cOh yeah, as a hater I triumphantly hate that and then throw it in the garbage.\u201d Sometimes you need to.\n\nAnother benefit of being in a band: there are other people in the room who can be honest when something is stupid or a bad idea.\n\nAlso, I think maybe you need to have high standards if you\u2019re going to make art? That\u2019s not true. I know some people who just love everything and that\u2019s fine. But personally I just think that deep down, you do know if something is bad. It\u2019s just hard to tap into that radical self-honesty. Let\u2019s say you write a line and you think, \u201cOh, that\u2019s a good line. I like that line.\u201d But the test it needs to pass should maybe be, \u201cIs that a mind-blowing line that\u2019s equal to my favorite mind-blowing line in my favorite book? Is it equal to the best line I\u2019ve ever written?\u201d I think that\u2019s the question to honestly answer. Sometimes I think people just skip over"} +{"output_text": ", shall be subject to judicial review, and the reviewing court shall decide all relevant questions of law, interpret constitutional and statutory provisions, and determine the meaning or applicability of the terms of any agency action. The reviewing court shall\u2014 (1) Hold unlawful and set aside agency action, findings, and conclusions found to be\u2014 (A) Arbitrary, capricious, an abuse of discretion, or otherwise not in accordance with law; (B) Contrary to constitutional right, power, privilege, or immunity;", "input_text": " of the Administrative Procedure Act, and with any other procedural requirements made applicable to the agencies by other statutes.\n\nSec. 8. Exemptions. (a) The procedures prescribed by this Order shall not apply to: (1) Any regulation that responds to an emergency situation, provided that, any such regulation shall be reported to the Director as soon as is practicable, the agency shall publish in the Federal Register a statement of the reasons why it is impracticable for the agency to follow the procedures of this Order with respect to such a rule, and the agency shall prepare and transmit as soon as is practicable a Regulatory Impact Analysis of any such major rule; and (2) Any regulation for which consideration or reconsideration under the terms of this Order would conflict with deadlines imposed by statute or by judicial order, provided that, any such regulation shall be reported to the Director together with a brief explanation of the conflict, the agency shall publish in the Federal Register a statement of the reasons why it is impracticable for the agency to follow the procedures of this Order with respect to such a rule, and the agency, in consultation with the Director, shall adhere to the requirements of this Order to the extent permitted by statutory or judicial deadlines. (b) The Director, subject to the direction of the Task Force, may, in accordance with the purposes of this Order, exempt any class or category of regulations from any or all requirements of this Order.\n\nSec. 9. Judicial Review. This Order is intended only to improve the internal management of the Federal government, and is not intended to create any right or benefit, substantive or procedural, enforceable at law by a party against the United States, its agencies, its officers or any person. The determinations made by agencies under Section 4 of this Order, and any Regulatory Impact Analyses for any rule, shall be made part of the whole record of agency action"} +{"output_text": " had to settle for a field goal. Clemson led 17-6.\n\nThe Clemson offense came out of the locker room and started the game with a nice drive. The drive stalled, but the Clemson defense forced a punt. The Clemson offense then had a nice drive, but the drive stalled again. The Clemson defense then forced a punt. The Clemson offense then had a nice drive, but the drive stalled again. The Clemson", "input_text": "ing up the pressure on the Auburn offense. The Clemson secondary was also doing a good job in coverage as Auburn tried to push it down the field. Bryant returned to the game having left after being driven into the ground hard earlier. Clemson had their best drive of the first half. The passing game started clicking with a good mix of medium and long passes to Cain, McCloud and Hunter Renfrow. With less than a minute left, Bryant ran it in from three yards out. The extra point game Clemson a 7-6 lead going into the half.\n\nClemson got the ball to start the second half and kept up the momentum they had at end of the first half. McCloud had a nice catch on the drive, and C.J. Fuller had a great block on a 3rd down conversion. Kelly Bryant reached the end zone on a nice 27-yard run that included a big broken tackle. Clemson led 14-6.\n\nThe next Auburn possession showed the greatness lurking in Clemson\u2019s defense. Clelin Ferrell and Austin Bryant both had sacks to force an Auburn punt. McCloud had another good punt return but the possession was short-lived as Fuller fumbled the ball after another miscue by the offensive line. The Clemson defense held again as Tre Lamar sacked Stidham on 4th down to put the Clemson offense back on the field. The Auburn defense had a good series though, and kept Clemson from taking advantage of good field position.\n\nThe next two possessions by both teams were won by the defenses. A nice punt put Auburn deep on their end of the field, and the Clemson defense pounced, forcing them to punt. Clemson then had a good drive, advancing primarily on runs by Bryant., but the drive stalled and Clemson"} +{"output_text": " are still in the intersection of North & Pennsylvania Ave. @cbsbaltimore pic.twitter.com/0XZ1q1q9Zp \u2014 Rick Ritter (@RickRitterWJZ) May 2, 2015\n\n9:59 PM EST: The crowd is getting restless.\n\n9:58 PM EST: The crowd is getting restless.\n\n9:57 PM EST: The crowd is getting restless.\n\n9:56 PM EST: The", "input_text": "10:19 PM EST: Initial freakout seems a little overstated.\n\nhttps://twitter.com/shawngude/status/594324470988419072\n\nOfficers moved through the #Baltimore City Hall lawn, cleared everyone out, arrested at least 10 who didn't leave. pic.twitter.com/fM5NmrZQKv \u2014 Colin Campbell (@cmcampbell6) May 2, 2015\n\n10:18 PM EST: nooooooooooooooo\n\nCops are surrounding Geraldo \u2014 Robert Lang WBAL (@Reporterroblang) May 2, 2015\n\n10:12 PM EST: Getting hotter.\n\nFox45 news right now is nuts, live arrests happening now at city hall \u2014 Carrie Wells (@cwellsbalt) May 2, 2015\n\n.@GeraldoRivera: \u201cYou\u2019ve got some real pounding going on here. The cops are swarming over the demonstrators.\u201d pic.twitter.com/UMKAaFYc1f \u2014 Fox News (@FoxNews) May 2, 2015\n\n10:11 PM EST: A little heat.\n\nWatching folks get arrested on @CNN for violating curfew. \u2014 Erica L. Green (@EricaLG) May 2, 2015\n\nFight breaks out in front of City Hall as law enforcement tries to move crowd. Curfew began 10 minutes ago. \u2014 Karen Campbell (@KarenCampbellTV) May 2, 2015\n\n10:02 PM EST: Defiance.\n\nIt's after 10:00 and there are still hundreds in the intersection of North & Pennsylvania Ave @cbsbaltimore pic.twitter.com/wf0V9YehI0 \u2014 Rick Ritter (@RickRitterWJZ) May 2, 2015\n\nFolks"} +{"output_text": " you're going to suspend a player for domestic violence, you have to suspend him for that act alone. It's not enough to suspend him for the act of threatening a woman. It's not enough to suspend him for the act of threatening a woman and the act of punching her in the face. It's not enough to suspend him for the act of threatening a woman and the act of punching her in the face and the act of kicking her in the face. It's not enough to", "input_text": " the rest of the Buena Vista 400 will be there, and they\u2019ve pledged to dig in for a fight. \u201cI don\u2019t plan on going anywhere,\u201d Melodie says. \u201cI\u2019m going to be here until they pick me up and carry me out.\u201d Charlie Riedel/Associated Press\n\nIt was April when an audio recording surfaced of Tyreek Hill threatening and arguing with his fiancee Crystal Espinal about allegations he broke their son\u2019s arm. It rocked the NFL. The league announced on Friday that Hill would not be disciplined, but the Chiefs were prepared either way.\n\nThat preparation centered on drafting Georgia receiver Mecole Hardman in the second round this past spring. While no one can say for certain if Hardman was an insurance policy for Hill, you'd be a fool to think otherwise. One scout we spoke with compared him to DeSean Jackson, and some teams viewed the pick as one of the best of the draft.\n\nNow, Hill is back, Hardman is around to bolster a roster already deep at the skill positions, and a team with maybe the most athletic offense in football is even more dangerous. Kansas City's attack has the potential to be so explosive that some teams' personnel men told B/R they believe the single-season scoring record of 606 points, set by the Peyton Manning-led Broncos in 2013, could fall.\n\nConsider that last year the Chiefs scored 565 points, and this offense might be better. Just look at the ages of some of the key contributors: Mahomes, 23. Hill, 25, Sammy Watkins, 26. Hell, Travis Kelce isn't even 30 yet.\n\n(One note before we get back to football: It's remarkable to me how a player who is on tape physically threatening a woman wasn't suspended for that act alone. If"} +{"output_text": " and POSTAL 2 co-creator, Rick Hunter, will be reprising his role as the Dude for POSTAL 4. Rick has been a part of the POSTAL franchise since the beginning and has been a huge part of the success of the franchise. He is a true POSTAL fan and we are honored to have him back on board.We are also excited to announce that the original POSTAL 2 cast will be returning for POSTAL 4. The original cast of POSTAL 2 was", "input_text": " is the best approach towards making the best POSTAL game ever happen. We considered a few alternative options of funding, such as approaching publishers, many of whom showed interest but ultimately backed away due to fears around the controversy P4 may generate. We also considered a Kickstarter but then decided selling a paper pitch to the community was not our style. We want to show and demonstrate actual gameplay before looking to the community for help.Working with a publisher this early on would have no doubt seen us having to hold back on some of the more outrageous ideas we have planned (not to mention most of them wouldn\u2019t touch it with a 10 foot dildo), so going at it our own way with the community\u2019s support is an exciting prospect for us. This way, we can make sure we\u2019re prioritizing the things that are central to our fans and players as well as getting a clear idea of what features you would like for us to implement and what aspects of the game you feel are the most important.We had entertained the idea of calling it something like The Real POSTAL 3 or even POSTAL 2x2 as a play on words. We could have also confused everyone and done the whole reboot thing by simply calling it \u201cPOSTAL\u201d. Ultimately, though, POSTAL 4 makes it clear that it\u2019s the next big game in the POSTAL franchise.Fuck no! POSTAL 4 is being made in-house by much of the same team that brought you POSTAL 2: Paradise Lost and POSTAL Redux \u2013 a team that understands and respects the community and what made POSTAL 2 the timeless classic it is. POSTAL 4 will be the true sequel that fans have been craving for well over a decade!We Regert Nothing!Rick Hunter was not available to reprise his role as the Dude for POSTAL 4. We are, however, excited to announce that industry veteran"} +{"output_text": " as Macron\u2019s first phase of reform was being implemented, Merkel\u2019s government was forced to backtrack on its own reform agenda.\n\nThe German chancellor\u2019s response was to accuse Macron of \u201cnot being serious\u201d about reforming the eurozone. This was a strange accusation, given that Merkel\u2019s own government had just been forced to backtrack on its own reform agenda.\n\nThe German chancellor\u2019s response was to accuse Macron of \u201cnot being serious\u201d about", "input_text": "s current predicament proves that blind European loyalism is, similarly, untenable. The reason is that the EU\u2019s architecture is equally difficult to deconstruct, sustain and reform.\n\nWhile Britain\u2019s political class is, rightly, in the spotlight for having made a mess of Brexit, the EU\u2019s establishment is in a similar bind over its colossal failure to civilise the eurozone \u2013 with the rise of the xenophobic right the hideous result.\n\nMacron was the European establishment\u2019s last hope. As a presidential candidate, he explicitly recognised that \u201cif we don\u2019t move forward, we are deciding the dismantling of the eurozone\u201d, the penultimate step before dismantling of the EU itself. Never shy of offering details, Macron defined a minimalist reform agenda for saving the European project: a common bank deposit insurance scheme (to end the chronic doom loop between insolvent banks and states); a well-funded common treasury (to fund pan-European investment and unemployment benefits); and a hybrid parliament (comprising national and European members of parliament to lend democratic legitimacy to all of the above).\n\nSince his election, the French president has attempted a two-phase strategy: \u201cGermanise\u201d France\u2019s labour market and national budget (essentially making it easier for employers to fire workers while ushering in additional austerity) so that, in the second phase, he might convince Angela Merkel to persuade the German political class to sign up to his minimalist eurozone reform agenda. It was a spectacular miscalculation \u2013 perhaps greater than Theresa May\u2019s error in accepting the EU\u2019s two-phase approach to Brexit negotiations.\n\nWhen Berlin gets what it wants in the first phase of any negotiation, German chancellors then prove either unwilling or incapable of conceding anything of substance in the second phase. Thus, just"} +{"output_text": " why I\u2019m posting it here.\n\nWhatsApp Messages Can Now Be Deleted an Hour After You Sent Them by Mistake: The Verge reports the latest version of the WhatsApp Delete for Everyone feature, which \u201cused to only allow you to delete messages up to seven minutes after you sent them\u2026extends that time limit significantly to one hour, eight minutes, and 16 seconds.\u201d\n\nA new WhatsApp for iOS update (2.18.31) is", "input_text": " out what the image is about and whether the website would contain more relevant content for your needs.\u201d\n\nFacebook Filters on Personal Profiles: This feature allows mobile users to filter posts in their news feed, allowing them to see posts made on a specific date, published by specific people, and assigned a specific privacy level. This new profile filter feature was spotted on mobile by Michael Stelzner.\n\nLinkedIn Updates Data and Content Rights in Terms of Service: Driven by the General Data Protection Regulation that will take effect across Europe in May 2018, LinkedIn announced several changes to its terms of service. The platform added more options for members looking to personalize the ads they see on LinkedIn. It also promised to provide more information on how it uses this data to customize its members\u2019 experiences on the network and updated when advertisers are allowed to access users\u2019 personal information. LinkedIn explains each of these changes with a guided tour on its Privacy Policy page.\n\nWhatsApp Set to Disrupt India Market With Push Into Digital Payments: The messaging app is testing a payment service that lets users transfer money to each other. The feature, dubbed WhatsApp Pay, is only available to a fraction of Indian users, but Bloomberg reports that a full rollout could come to all users by April.\n\nWhatsApp Messages Can Now Be Deleted an Hour After You Sent Them by Mistake: The Verge reports the latest version of the WhatsApp Delete for Everyone feature, which \u201cused to only allow you to delete messages up to seven minutes after you sent them\u2026extends that time limit significantly to one hour, eight minutes, and 16 seconds.\u201d\n\nA new WhatsApp for iOS update (2.18.31) is available on AppStore.\n\nIt is a bug fixes update, but it has the new \u201cDelete for everyone\u201d limit, that\u2019s"} +{"output_text": "us is a human who is plugged into the Matrix. He is the leader of the resistance.\n\nThe resistance is led by Neo. Neo is a human who is plugged into the Matrix. He is the leader of the resistance.\n\nThe resistance is led by Neo. Neo is a human who is plugged into the Matrix. He is the leader of the resistance.\n\nThe resistance is led by Neo. Neo is a human who is plugged into the Matrix. He is the leader of", "input_text": "rence Fishburne and Hugo Weaving play the lead roles and the brilliance was all brought together by the Wachowskis. Here is the simplified plot analysis and ending of the movie The Matrix explained, spoiler alert.\n\nI\u2019ve put together a quick explanation and a detailed explanation. You can pick your poison based on the patience you\u2019ve got. Alternately, if you\u2019re looking only for The Matrix Ending Explained, you can go \u2013 here.\n\nThe Matrix Plot Explained Simply \u2013 Quick Version\n\nThis is more than The Matrix synopsis, it\u2019s full of spoilers. The events of The Matrix are happening in the year close to 2199. All the parts of the movie that appear to be the regular 1999 is not real. It is only a computer simulated program that has been created to enslave the human race. This computer simulation is called The Matrix.\n\nWho\u2019s created this simulation, The Matrix?\n\nMachines, with Artificial Intelligence.\n\nWho created this Artificial Intelligence?\n\nThe humans, in the early 21st century.\n\nWhy does AI enslave mankind?\n\nWell, you\u2019re going to have to read the detailed explanation for this one. The short version is that the humans and AI had a fall out. AI eventually saw humans as a threat to their existence. With humans plugged into The Matrix, the AI could tap into bioelectricity that the human body produces. Humans are used as a source of electrical and heat energy by the machines. The humans plugged into the Matrix don\u2019t realize they are in a simulated dream and sleep through their entire lives.\n\nThe Resistance finds Neo\n\nThere are a group of people in the real world who are free from the Matrix. Morpheus and his team are one such group that forms the resistance. Morphe"} +{"output_text": " been keeping them to myself. But I had never told Chaya about my mother\u2019s dementia. I had never told her about my father\u2019s abandonment of the family. I had never told her about my sister\u2019s rejection of me. I had never told her about my fights with my extended family. I had never told her about my mother\u2019s cognitive decline. I had never told her about my father\u2019s abandonment of the family. I had never told her about my sister\u2019s", "input_text": " was shocked. \u201cShe had no idea,\u201d Laura recalled. \u201cShe said Chaya has health problems but she doesn\u2019t have cancer.\n\n\u201cI also mentioned to her that she\u2019s been telling us that Samantha\u2019s kids were also very ill.\u201d That part also wasn\u2019t true. Neither was the Orthodox upbringing with their grandparents and so many other things.\n\nWhen Laura told me what she had learned from Samantha \u2014 the real one \u2014 I went from feeling relieved that we had been right to being angry in a matter of seconds. I suppose a miniscule part of me believed that there was going to be some explanation that would account for everything \u2014 or almost everything \u2014 and that I hadn\u2019t been lied to for five years. But once that last shred of doubt was gone, I was furious. For me, the lie about cancer was not the most egregious one; it was all the ones that she had told me about her family and upbringing. Chaya and I would trade stories about our dysfunctional childhoods \u2014 her relationship with her father, her upbringing with her grandparents (about whom she spoke glowingly), and her cousins\u2019 rejection of her after she decided to leave Orthodoxy. I told her about my father\u2019s abandonment of the family when I was 8 (and many other bad acts), my strained relationship with my sister, the fights that I would have with members of my extended family, fights that were ostensibly about my mother\u2019s care but where the subtext was about my abandonment of Orthodox Judaism. I spoke about my mother most since she was \u2014 and still is \u2014 in cognitive decline, and early into my friendship with Chaya, I was feeling utterly demoralized, trying and failing to help my mother.\n\nI had shared these things with others (and the internet) so it wasn\u2019t that I had"} +{"output_text": " the case of Chelsea Manning, the military judge will likely give Manning credit for the time she spent in pretrial confinement. Third, Manning will likely be released on parole after serving a year of her sentence.\n\nFourth, Manning will likely be released on parole after serving a year of her sentence.\n\nFifth, Manning will likely be released on parole after serving a year of her sentence.\n\nSixth, Manning will likely be released on parole after serving a year of her sentence.\n", "input_text": " one function of the military justice system \u2014 a key reason for keeping it separate from the civilian justice system \u2014 is to enhance discipline across the armed forces, no Army judge would have cut slack for a soldier who had violated his or her security pledge so blatantly.\n\nThird, in this context, and given that Manning was convicted on 20 of 22 counts, a sentence of 35 years is not excessive. More to the point, it\u2019s very unlikely that Manning will spend anywhere nearly that long behind bars.\n\nThe presiding judge, Col. Denise Lind, could have imprisoned Manning for 90 years had she pushed the sentencing guidelines to the max. The Army\u2019s prosecutors urged her to send Manning to prison for 60 years, as a deterrent to others. It\u2019s a good thing \u2014 for Manning, the legitimacy of military courts, and freedom of the press \u2014 that Lind waved off their absurd calculus. Having already acquitted Manning of the one charge (\"aiding the enemy\") that carried an automatic life sentence, Lind apparently reasoned that it would be unjust to hand down a sentence amounting, for all practical purposes, to the same thing.\n\nIt\u2019s worth noting that, early on in the trial, Manning pleaded guilty to a handful of the charges with the understanding this would mean 20 years in jail. After the guilty verdict on all but two of the charges, Manning\u2019s lawyer pleaded for a 25-year sentence as a gesture of mercy for Manning\u2019s good intentions, apologies, and promise to lead a good life after release. As a compromise, 35 years comes much closer to Manning\u2019s position than to that of the judge\u2019s fellow officers.\n\nBut there are several reasons why Manning will likely see wide-open skies well before 2048. First, Lind ruled that the 3.5 years of time served would count against the sentence. Second, as in"} +{"output_text": " ist mit denen, die an COVID-19 sterben?\n\nWenn man die Zahlen der Schweiz betrachtet, so ist die Mortalit\u00e4t bei \u00abInfluenza\u00bb in den letzten Jahren um ca. 50% gesunken.\n\nZudem: die Schweiz hat eine sehr hohe Infektionsrate, aber die Mortalit\u00e4t bei \u00abInfluenza\u00bb ist in den letzten Jahren um ca. 50", "input_text": "wegen\u00bb und nur bei COVID-19 viele \u00abmit\u00bb.\n\nZudem: wenn es in einem Jahr in der Schweiz angeblich 1600 Influenza-Tote gab, so sprechen wir \u00fcber 1600 Tote \u00fcber 12 Monate \u2013 ohne pr\u00e4ventive Massnahmen. Bei COVID-19 gab es jedoch 600 Tote in 1(!) Monat und das trotz massiver Gegenmassnahmen. Radikale Gegenmassnahmen k\u00f6nnen die Verbreitung von COVID-19 um 90% senken \u2013 man kann sich also vorstellen, welches Szenario ohne Gegenmassnahmen herrschen w\u00fcrde.\n\nZudem: in einem Monat wurden in der Schweiz >2200 Patienten wegen COVID-19 hospitalisiert und es wurden gleichzeitig bis zu 500 Patienten auf verschiedenen Intensivstationen hospitalisiert. Nie hat jemand von uns auch nur ann\u00e4hernd solche Zust\u00e4nde im Rahmen einer \u00abInfluenza\u00bb gesehen.\n\nIm Rahmen einer \u00abgew\u00f6hnlichen\u00bb Influenza erwerben ca. 8% der Betreuenden ebenfalls eine Influenza, aber niemand stirbt daran. Bei COVID-19 werden 25% bis 30% der Betreuenden infiziert und das ist mit einer signifikanten Mortalit\u00e4t verbunden. Dutzende von \u00c4rzten und Pflegepersonen, die COVID-19 Patienten betreut haben, sind an derselben Infektion verstorben.\n\nZudem: suchen Sie einmal die harten Zahlen zu \u00abInfluenza\u00bb! Sie werden keine finden. Was"} +{"output_text": "azd, a woman was arrested for riding a bicycle.\n\nKhamenei's fatwa against women riding bicycles in public places has been ignored by the Islamic Police.\n\nKhamenei's fatwa against women riding bicycles in public places has been ignored by the Islamic Police.\n\nKhamenei's fatwa against women riding bicycles in public places has been ignored by the Islamic Police.\n\nKhamenei's fatwa against women riding bicy", "input_text": " still makes speeches, summons civilian and military officials, and issues orders. But, increasingly, people hear him but don't listen.\n\nA few months back he threatened that if the US tears up the \"nuclear deal\", he would \"shred it.\"\n\nHowever, when Trump threw the \"deal\" into the ashcan, the \"Supreme Guide\" swallowed his pride and urged Rouhani to find some way of saving something from the ghostly \"deal.\"\n\nWhen the Trump administration demanded that Tehran freeze its missiles project, Khamenei refused. He summoned his generals to \"produce more and more missiles, and more powerful ones\".\n\nLast month, however, Muhammad-Ali Aziz-Jafari, the general who commands the Revolutionary Guard Corps (IRGC), publicly declared that Iran had frozen its missile project at a maximum range of 2,000 kilometers. Even then he had to stress that 2,000 kilometers was the length of Iran's own territory, from the border with Turkey to the Gulf of Oman.\n\nNot a peep from the \"Supreme Guide\".\n\nKhamenei's order to reopen the Arak plutonium plant, and install new centrifuges and enrich uranium to a higher degree has also been buried under a ton of lip-service. The government doesn't have enough money to pay its employees let alone spending on white elephants to please the Ayatollah.\n\nIn the past months, Khamenei has issued two fatwas forbidding women from riding bicycles in public places, notably city streets. However, the Islamic Police (NAJA) has officially declared that it has no intention of enforcing that ban against the many Iranian women who ride bicycles to work and school. In fact, women have continued to ride in the streets in deliberate protest of Khamenei's decree. In one isolated incident, in Y"} +{"output_text": ", Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner, Brittney Griner", "input_text": "uge animal person,\u201d Walker said he had a \u201cwonderful time\u201d shooting the spot despite the sweltering conditions inside the car.\n\nOn ExpressNews.com: Growth and growing pains from Samanic at Summer League\n\n\u201cI\u2019ve seen more than enough videos of animals, especially dogs, being locked up inside cars,\u201d he said. \u201cIt\u2019s good to see a different perspective. You can see how hot it can really get.\u201d\n\nUnlike several states, including Florida, Texas has no specific laws to protect dogs in hot cars. But Texas Monthly reported in 2017 that a 20-year-old man in Manor was charged with cruelty to nonlivestock animals \u2014 a Class A misdemeanor with potential penalties of up to a $4,000 in fines and a year in jail \u2014 after police discovered an 8-week-old puppy locked in his car in a Walmart parking lot.\n\n\u201cThere are 30 states that have laws either criminalizing the practice, explicitly stating that law enforcement can enter (vehicles) in these situations or provides Good Samaritan exemptions from any kind of prosecution, and Texas is not one of those 30 states,\u201d said Laura Donahue Halloran, executive director of the Texas Humane Legislation Network.\n\nHalloran said bills addressing the issue have failed in the past two sessions of the Texas Legislature.\n\n\u201cI can say without hesitation the entire animal welfare community is incredibly concerned, because obviously we are one of those states that almost year-round has some sort of inclement weather,\u201d Halloran said. \u201cA Good Samaritan law, I think, would be really essential in a state like ours that has such extreme weather.\u201d\n\nWalker joins a long list of athletes who have teamed with PETA to do videos, including Tyrann Mathieu, Chris Harris Jr., Alex Morgan, Christen Press"} +{"output_text": "asma que se le aparece en la cabeza y que no se puede evitar. Entonces, cuando se le dice que no puede matar, se le dice que no puede matar a nadie. Y se le dice que no puede matar a nadie porque es un sicario. Y se le dice que no puede matar a nadie porque es un sicario\u201d.\n\nEl sicario de 25 a\u00f1os, que muri\u00f3 en Bogot", "input_text": "jas de ser an\u00f3nimo y pasas a ser un modelo a seguir. Nadie se mete contigo y le gustas a las ni\u00f1as.\u201d Seg\u00fan el antrop\u00f3logo, \u201cen Las Cruces ha habido \u2013y hay\u2013 polic\u00edas que liberaban zonas a ladrones para que robaran a cambio de que luego asesinaran a otros ladrones del barrio. \u201cEn este pa\u00eds, lo que no puedes hacer con una fuerza legal, lo haces con una fuerza ilegal\u201d.\n\nEn la fundaci\u00f3n en la que trabajaba, Forero fue terapeuta de un sicario de 25 a\u00f1os: \u201cLleg\u00f3 a Bogot\u00e1 porque quer\u00edan matarlo en otra ciudad. Sus primeros cr\u00edmenes hab\u00edan sido por 50 d\u00f3lares, y hab\u00eda tenido que cumplir la prueba para ser un sicario: matar a un compa\u00f1ero, o sea a otro sicario. Luego, con la experiencia, la tarifa sube. En general, terminan asesin\u00e1ndolos cuando ya son muy buenos en lo suyo\u201d. Era un sicario que le rezaba a sus balas. Ped\u00eda cumplir su trabajo gastando la menor cantidad de disparos. \u201cA diferencia de los narcos o grandes ladrones internacionales, el sicario siempre sigue siendo el muchacho de barrio que todo el mundo conoce.\n\nNunca cambia de modo de vida\u201d.\n\nLenin recuerda que ese sicario \u201chablaba sin culpa ni arrepentimientos. Para \u00e9l, la muerte era \u2018el pan de cada d\u00eda\u2019: se convirti\u00f3 en ese fant"} +{"output_text": "( customer: \"stripe_brandon\", amount: 3_000, currency: \"usd\", description: \"Usage charges\" ) StripeEvent::InvoiceCreated.new(@payload).perform end test 'adds next level charge for usage' do Stat.create!(user: users(:brandon), step: steps(:nav_one), impressions: 3_000, date: 5.days.ago) Stripe::InvoiceItem.expects(:create).with", "input_text": " change this scenario?\n\nmodule StripeEvent class InvoiceCreated attr_reader :payload def initialize(payload, usage_service = Billing::Usage) @payload = payload @usage_service = usage_service end def perform if user.created_at < 14.days.ago Stripe::InvoiceItem.create( customer: user.stripe_id, amount: additional_charges_in_cents, currency: \"usd\", description: \"Usage charges\" ) end end private def additional_charges_in_cents Billing::Tier.new(usage).additional_charges_in_cents end def usage @usage_service.new(user).last_30_days end def user @user ||= User.find_by(stripe_id: payload[\"data\"][\"object\"][\"customer\"]) end end end\n\nBecause the usage method requires instantiation from within the class, I had to update the fake test Usage classes to accept user as an argument during instantiation:\n\nrequire 'test_helper' class InvoiceCreatedTest < ActiveSupport::TestCase def setup @payload = { \"data\" => { \"object\" => { \"customer\" => \"stripe_brandon\" } } } end test 'adds invoice item based on usage' do Stripe::InvoiceItem.expects(:create).with( customer: \"stripe_brandon\", amount: 1900, currency: \"usd\", description: \"Usage charges\" ) StripeEvent::InvoiceCreated.new(@payload).perform end test 'adds next level charge for usage' do Stat.create!(user: users(:brandon), step: steps(:nav_one), impressions: 3_000, date: 5.days.ago) Stripe::InvoiceItem.expects(:create).with"} +{"output_text": " BT. \"I need to find a club where I can play my game.\"\n\nHojbjerg was a key player for Schalke in the second half of the 2015/16 season\n\nHojbjerg's decision to leave Bayern was a surprise, but it was not a complete shock. He had been linked with a move to Manchester United in the summer, and the Red Devils were also keen on signing him.\n\nBut the youngster opted to join Sch", "input_text": " the teenager's progress was also stalled by the tragic death of his father during Guardiola's first season in charge.\n\nHojbjerg has earned 17 caps for Denmark since his debut in 2014\n\nHojbjerg did start at right-back in Bayern's German Cup final win over Borussia Dortmund in May 2014, but opportunities in his favoured position were limited. After 18 months under Guardiola, he had only made seven Bundesliga and Champions League starts, and while he was already emerging as a key player for Denmark, a change was required at club level.\n\nHojbjerg's first extended run of games came on loan at Bayern's Bundesliga rivals Augsburg in the second half of the 2014/15 season. Manager Markus Weinzierl described the youngster as \"sensational\" after an impressive debut in a 3-1 win over Hoffenheim, and Hojbjerg featured prominently as the Bavarian club clinched an unexpected fifth-placed finish.\n\nHojbjerg got two goals and three assists in 16 appearances for Augsburg\n\nHojbjerg had impressed in a box-to-box role at Augsburg, but back at Bayern it was decided he would spend the 2015/16 campaign with Schalke.\n\nBayern's sporting director Matthias Sammer insisted they still had \"high hopes\" for the youngster, but a frustrating season at the Gelsenkirchen club yielded just 13 Bundesliga starts as he struggled to hold down a first-team place.\n\nHis development suffered, and by the end of the season, Hojbjerg had decided he needed to leave Bayern permanently to kick-start his career. There were even suggestions he had fallen out with Guardiola. \"I need to find something that fits me,\" he told Danish newspaper"} +{"output_text": " failsafe.\n\nThe kill box is a failsafe. It is a failsafe that is not a failsafe.\n\nThe kill box is a failsafe that is not a failsafe.\n\nThe kill box is a failsafe that is not a failsafe.\n\nThe kill box is a failsafe that is not a failsafe.\n\nThe kill box is a failsafe that is not a failsafe.\n\nThe kill box is a failsafe that is", "input_text": " dies. The phrase is not stored anywhere in a memory the mechanical brain can access. Ava hears the phrase but cannot independently recall it.\n\nBut Ava is clever. Ava sees a recording of Nathan saying something which it cannot remember. It constructs a robotic head capable speaking the phrase, kills Nathan and escapes to the outside world.\n\nExcept, there is a mirror in its room. In the ultraviolet wavelengths Ava sees but does not know it sees, is a hidden pattern it must see every day or, as you surmised, it dies.\n\nAnother layer of defense is a kill box located within the willful machine. The kill box is primarily a hardware device that can only be uploaded with a new program from an external source. Ava has no way to change the programming of the kill box; for it, those interfaces do not exist.\n\nFor amusement, the kill box is buried in the robot\u2019s butt. The machine cannot sense the location or the condition of the kill box. It is essentially blind to the kill box.\n\nThe AI is not allowed to directly affect anything in the real world. All of its commands, all of its instructions, everything it senses passes through the kill box.\n\nIsaac Asimov postulated the delightful Three Laws of Robotics. These rules guiding robotic actions were programmed into the machine\u2019s brain. Unfortunately as part of the robot\u2019s reasoning structure, its responses were guided by its interactions with the surrounding world. These laws lead to unpredictable responses, a long and grand series of stories and a problem. In the end, the robots reasoned they were humans and humans were not. The Three Laws applied to them but not to the biological nuisances.\n\nSeparating the kill box from the robot\u2019s reasoning faculties eliminates the problem of an unexpected response from the defense mechanism. The kill box provides a"} +{"output_text": " the academic world.\n\nRobert Wiblin: So you think that the incentives are all wrong. But you think that there\u2019s a lot of value in having this data aggregation.\n\nEva Vivalt: Yeah, I think there\u2019s a lot of value in having this data aggregation. I think it\u2019s a good thing. I think it\u2019s a good thing for the world. I think it\u2019s a good thing for the scientific community. I think it", "input_text": " to keep abreast of all of the literature and all the various topics \u2014 I mean, it\u2019s even more of a constraint for the medical literature where there\u2019s loads of studies and new ones coming out all the time. Meta-analyses can go out of date quite quickly and they\u2019re not really incentivized properly in the research community so the only way to get people to actually do them and keep the evidence up-to-date in some sense is by at least making the process easier.\n\nI don\u2019t think that it can be ever 100% done by computer. I think you\u2019re still going to need some inputs from people. But if you can reduce the amount of effort it takes by 80% or 90% and just have people focus on the harder questions and the harder parts of that, that would be a huge benefit.\n\nRobert Wiblin: Do you think there\u2019s enough of this data aggregation? Or are there too few incentives for people to do this in academia?\n\nEva Vivalt: No, I think the incentives are all wrong. Because researchers, they want to do the first paper on a subject. Or ideally, if not the first then the second. The third is even worse than that. And by the time you get to do a meta-analysis, well that\u2019s kind of the bottom of the bin in some regards. You think it would be more highly valued, but it\u2019s not.\n\nRobert Wiblin: Wouldn\u2019t you get a lot of citations from that? Because people would trust the results of a meta-analysis more that the individual papers.\n\nEva Vivalt: I think that\u2019s fair. And you can get some fairly well cited meta-analyses. Unfortunately, citations are just not the criterion that\u2019s really used for evaluating research in"} +{"output_text": " executive. Christopher, who has been with PG&E for more than 30 years, was the one who ordered the shut-offs, which began Tuesday.\n\nChristopher did not respond to a message from The Chronicle on Thursday.\n\nThe event was not the first time PG&E has hosted a wine-tasting event for its top customers. In 2015, the company hosted a similar event at the same winery, according to a PG&E document obtained by The Chronicle.\n", "input_text": " gas side of Pacific Gas and Electric Co. thought they had something more important to do: wining and dining on the company dime, The Chronicle has learned.\n\nPG&E confirmed that 10 to 12 employees on the gas side of the business were mingling with 50 to 60 of their top customers at a winery in Sonoma County on Monday and Tuesday. It was in the run-up to PG&E\u2019s unprecedented power shut-offs for hundreds of thousands of customers this week, a highly controversial act that could cost the California economy $2.6 billion by some estimates and even put people in harm\u2019s way.\n\nFor those living near the posh venue, Tuesday had another meaning. Two years earlier, PG&E power poles ignited a series of deadly and destructive wildfires across Wine Country \u2014 including blazes that burned in Sonoma County.\n\nBill Johnson, CEO of the utility\u2019s parent company PG&E Corp., admitted to the event in an exclusive interview with The Chronicle on Thursday. He described it as a colossal mistake in poor taste and promised it would never happen again.\n\n\u201cI want to apologize to every one of our customers,\u201d Johnson said. \u201cInsensitive, inappropriate, tone deaf are the terms I would use to describe this.\u201d\n\nJohnson was not at the event and said he learned of it only \u201crecently\u201d \u2014 after a reporter began asking questions. Had he been aware before, Johnson said he \u201cwould have stopped it.\u201d\n\nThe event this week took place at the Silver Oak Winery north of Healdsburg, according to PG&E. Guests had \u201cdinner and a wine tasting,\u201d Johnson said. A representative of the winery did not respond to a message from The Chronicle on Thursday.\n\nAmong those in attendance was Mel Christopher, the company\u2019s top gas"} +{"output_text": " be no co-fare discount for the subway.\n\nThe City\u2019s own analysis shows that the subway is the most cost-effective way to serve the NIAs.\n\nThe City\u2019s own analysis shows that the subway is the most cost-effective way to serve the NIAs.\n\nThe City\u2019s own analysis shows that the subway is the most cost-effective way to serve the NIAs.\n\nThe City\u2019s own analysis shows that the subway is the most", "input_text": " agencies.\n\nThe high base fare discourages shorter trips and compact development, and the low distance-based component encourages longer trips and urban sprawl.\n\nThe lack of financial incentives to use GO Transit does not encourage alternative modes of transportation as a policy imperative, contributing to greater greenhouse gas emissions by private vehicles. [pp 10-11]\n\nEquity issues are discussed far more extensively by the City than by Metrolinx. There remains a basic question of \u201cwhat do we mean by \u2018equity'\u201d, but at least the City is engaged with this. Today, discounts go to various groups regardless of their ability to pay such as seniors and students. Indeed, there is no mechanism in the current fare system to differentiate on any basis other than a widely recognized and documented one (seniors\u2019 cards, student cards, apparent age of children, etc).\n\nFor transit fares this is a particular problem because subsidies are provided individually when the fare is paid, not (like shelter costs) on a group basis to a family and its dependents. This can be addressed through smart cards like Presto registered as part of a benefits group, but this is not as simple as a supply of subway tokens provided below cost for sharing as needed.\n\nThe City has 31 Neighbourhood Improvement Areas of which only 5 are served by the subway system, but more by the GO network.\n\nThis alignment is not surprising because rail corridors tend to serve old industrial districts where housing would be cheaper, and the social environment less supportive than in less industrialized residential areas. The NIAs represent 12% of the City\u2019s population. However, the high cost of short-distance GO fares makes this network essentially vanish from the travel options available to low-income riders.\n\nCo-fares figure into this too because even if GO\u2019s own fares were lower, there would"} +{"output_text": " for charter schools in Michigan and found that the state received $1.7 million in federal funds for charter schools that never opened.\n\n\u201cThe state of Michigan has been a major recipient of federal charter school funding, and the state has been a major recipient of federal funding for charter schools that never opened,\u201d the center said. \u201cThe state has received more than $1.7 million in federal funds for charter schools that never opened.\u201d\n\nThe center said it found that the state received $", "input_text": " as Michigan State\u2019s top NFL draft prospect\n\nMichigan State\u2019s 2020 signees cite excitement, uncertainty in Mel Tucker\u2019s hiring Report: $1.7M paid to Mich. charters that never opened\n\nAt least $1.7 million in federal funds was paid to 25 Michigan charter schools that never opened, according to a report released Wednesday by a Wisconsin-based progressive advocacy group.\n\nThe Center for Media and Democracy said it reviewed federal and state records that show the \u201cghost schools\u201d were approved for $3.7 million from the federal Charter Schools Program in 2011 and 2012, and received at least $1.7 million.\n\nThe center said that WestEd \u2014 a private company hired by the federal government to monitor states\u2019 handling of the funds and compliance with regulations \u2014 alerted the U.S. Department of Education that some schools were receiving grants and not opening, \u201cbut the department took no meaningful action,\u201d according to documents reviewed by CMD.\n\nIn addition, the center reported that 108 charter schools in Michigan closed after receiving more than $1 million in CSP grants; many closed \u201cdue to lack of \u2018academic viability\u2019 (poor results) while others have closed due to lack of \u2018financial viability\u2019 (such as inadequate enrollment) and some for both or other failings,\u201d the report said.\n\nThe report says more than $3.7 billion in federal funding has been spent on charter schools nationwide since 1995 but that \u201cthe public does not have ready access to key information about how their federal and state taxes are being spent to fuel the charter school industry since the 1990s.\u201d\n\nThe center said it filed a Freedom of Information Act request with the U.S. Department of Education and found that since 2010, 139 charter schools in Michigan received nearly $35 million in federal funds.\n\nThe center also examined federal funding"} +{"output_text": " listed for $1,000 a night.\n\nThe EPA has said that Pruitt paid for the room himself.\n\nThe IG\u2019s letter to Carper said that the agency would look into the matter.\n\n\u201cThe EPA has determined that the rental of the room was not a violation of the travel regulations,\u201d the letter said. \u201cHowever, the EPA has determined that the rental of the room was not a violation of the ethics regulations.\u201d\n\nThe IG\u2019s letter", "input_text": " letter to his office from Sen. Tom Carper, D-Delaware, who wanted information about reports that the trip cost as much as $40,000.\n\nCarper also asked the inspector general to look into the purpose of Pruitt's trip, citing concern that natural gas exports are not part of EPA's mission to \"protect human health and the environment.\"\n\nIn response to that request, the IG said he would expand the scope of the investigation of Pruitt's travel costs through the end of 2017 and would look at whether EPA followed all policies and procedures. The IG did not say in the letter that the inquiry would include the subject matter of the trip.\n\nNow several lawmakers are calling for the internal watchdog to also look at one of Pruitt\u2019s missions while there \u2014 promoting U.S. liquefied natural gas (LNG) exports.\n\nThat appears to fall outside the typical purview of the agency, the lawmakers told ABC News. The job of encouraging U.S. oil and gas exports usually falls to the U.S. Energy Department.\n\n\u201cI think it\u2019s outrageous,\u201d said Rep. David N. Cicilline, a Rhode Island Democrat, told ABC News late Tuesday. \u201cThe EPA is charged with serving the American people to keep our air clean and our water safe. This is not an area within his portfolio. He\u2019s not supposed to be globetrotting to promote the sale of LNG.\u201d\n\nDetails of the December 2017 trip, which included a two-day layover in Paris, have drawn scrutiny from Democratic lawmakers, especially since reports surfaced that Pruitt was renting a $50-a-night bedroom from the wife of J. Steven Hart, the chairman of Williams and Jensen, a firm that does extensive lobbying in the oil and gas arena. The condo was"} +{"output_text": " the JavaScript API or the .NET SDK) and relational data (using SQL). The Deep Dive will then move on to the core of the Cosmos DB offering, the core database engine, and the core database API. The Deep Dive will conclude with a look at the Cosmos DB management and monitoring capabilities.\n\nMicrosoft is also hosting a Cosmos DB Deep Dive in New York City on April 6. The event will be led by Microsoft MVP and CTO of Sleek", "input_text": " well suited for storing product catalog data.\u201d Cosmos Case Study\n\nMicrosoft backs up its claims for Azure Cosmos with case studies of the database in action from a number of prestigious customers.\n\n\n\nOne Microsoft customer story highlights how FUJIFILM employed Azure Cosmos database speed to enhance their customers\u2019 experience. The Japanese company, which has transitioned from the photographic film business to become a leader in digital photography, wanted to improve digital photo management and file sharing for users of its IMAGE WORKS service. Photographers are typically anxious to see and share their work as instantly as possible. Using Azure Cosmos DB for its image file database, FUJIFILM was able to provide users with higher responsiveness and lower latency from IMAGE WORKS. Microsoft noted that response time for end users accelerated by a factor of 10 while some photographer interactions accelerated by 20 times and more over the previous database. With Azure Cosmos, a tree view photo display in image search processing, which included 140 tables and 1,000 rows of SQL queries, went from 45 seconds, which would seem like an eternity to impatient shutterbugs, to two seconds. \u201cThe more responsive we can make IMAGE WORKS, the more productive our customers can be,\u201d said Yuki Chiba, Design Leader of the Advanced Solutions Group IMAGE WORKS Team at FUJIFILM Software. Deep Dive on Cosmos DB\n\nIf you want to go beyond reading about it and really explore the Cosmos, Visual Studio Live! coming to Austin, TX March 30 through April 3 offers a full afternoon Deep Dive on Cosmos DB. Leonard Lobel, Microsoft MVP and CTO at Sleek Technologies, Inc., will lead developers on a journey through the Cosmos DB starting with an introduction including its multi-model capabilities which allow you to store and query schema-free JSON documents (using either"} +{"output_text": " will lead to the US and its allies.\n\nThe US has been trying to topple the Assad government for years, but it has failed. The US has been trying to topple the Assad government for years, but it has failed. The US has been trying to topple the Assad government for years, but it has failed. The US has been trying to topple the Assad government for years, but it has failed. The US has been trying to topple the Assad government for years", "input_text": "When the Soviet Union fell apart, there was much talk from the US of a multi-polar world, where Washington would be just one influential player among many \u2013 a world where an autonomous India would play a vital role. It was nice sounding talk. But that\u2019s all it was \u2013 talk. In the wake of the collapse of the USSR, the US has been hell-bent on achieving global superiority.\n\nThe US\u2019s orbit of influence has extended throughout Eastern Europe and into many of the former Soviet states in central Asia. While Bush senior was mouthing media-friendly words about multi-polarity, Dick Cheney was at the same time stating that the US sought world domination. Look no further to see the US track record by casting your mind back to events in the former Yugoslavia, Libya and Iraq. Look no further to see its role currently in Yemen, Afghanistan, Syria and Pakistan. To date, the US has been responsible for millions of deaths and maimings in its quest for superiority, but its project now appears to be reaching a critical point.\n\nUnfortunately for the Obama regime, it\u2019s no longer the early 1990s when the US believed it reined supreme and Russia was in disarray and China still relatively weak. China has emerged as a genuine global player and Russia has a new-found confidence under Putin. If China and Russia thought Libya was worth sacrificing, they regard the more significant Syria as a different matter entirely.\n\nA former Soviet ally that still has strong links with Russia, Syria plays host to Russia\u2019s only naval base outside of the former USSR. That in itself is something the Russians think is worth defending, given their build up of naval forces in the eastern Mediterranean and their military hardware supplies to Syria. Both Russia and China know that if the US, its allies and its proxy Free Syrian Army topple the Assad government, all roads"} +{"output_text": " in der DDR bin.\n\nWie war das Leben in der DDR?\n\nAngelika Nguyen: Ich war in der DDR in einer sehr kleinen Stadt, die war sehr klein. Es gab keine gro\u00dfen Verkehrsmittel, keine gro\u00dfen Stra\u00dfen, keine gro\u00dfen Pl\u00e4tze. Es gab keine gro\u00dfen Kirchen, keine gro\u00dfen Kirchen. Es gab keine gro\u00dfen Kinderg\u00e4rten,", "input_text": "genommen. People of Color aus dem Osten \u2013 das ist vielen zu kompliziert.\n\nEin rechtsextremer Politiker hat mal gesagt: \u201eDie DDR war das deutschere Deutschland\u201c. Ist daran etwas richtig?\n\nAngelika Nguyen: Na ja, ich bin ein Beispiel daf\u00fcr, dass die DDR nicht ganz so homogen war. Aber ich war tats\u00e4chlich damals in einer komplett wei\u00dfen und sehr normierten Umgebung unterwegs. Das war in den sechziger und siebziger Jahren. Da gab es in der DDR noch keine Vertragsarbeiter*innen, nur ein paar Studierende aus anderen L\u00e4ndern. Dann sind Menschen in Gruppen per Vertrag eingereist, das war extrem kontrolliert. Insofern ist da was dran. Die DDR war wei\u00dfer, und Ostdeutschland ist es heute auch.\n\nIst es deshalb zu einem Sehnsuchtsort f\u00fcr Rechtsradikale geworden? Zahlreiche westdeutsche Rechte sind ja nach der Wende in den Osten gezogen, auch wichtige K\u00f6pfe der Ost-AfD stammen aus dem Westen.\n\nManja Pr\u00e4kels: Diese Enge und der Muff der DDR waren sehr speziell und sehr deutsch. H\u00e4keldeckchen sind bei uns Jahrzehnte sp\u00e4ter aus der Mode gekommen als in Westdeutschland. In der DDR ist die Zeit angehalten worden. Und es wurde wahnsinnig viel marschiert, das f\u00e4llt mir auf, wenn ich"} +{"output_text": "I\u2019m here to support the science,\u201d Johnson said. \u201cI\u2019m here to support the science.\u201d\n\nRep. Gerry Connolly (D-Va.) was also at the hearing, but he was not able to get in.\n\n\u201cI\u2019m here to support the science,\u201d Connolly said. \u201cI\u2019m here to support the science.\u201d\n\nRep. Gwen Moore (D-Wis.) was also at the hearing, but she was not able to", "input_text": " in violation of House rules, were filming and taking photos on the floor. Rep. Scott Peters (D-Calif.) was the first member to live-stream the action on Periscope, which C-SPAN promptly picked up and carried live on air.\n\nA number of lawmakers made remarks which were captured on the stream, including Rep. Bobby Rush (D-Ill.), who said, \u201cSpeaker Ryan, you can run but you can\u2019t hide.\u201d\n\nI'm on the House floor with @repjohnlewis & Dems staging a sit-in to demand action on commonsense gun legislation pic.twitter.com/byIivby5gG \u2014 Rep. John Yarmuth (@RepJohnYarmuth) June 22, 2016\n\nTime to occupy the House to demand action. #NoBillNoBreak #DisarmHate pic.twitter.com/C7BZpzNvxL \u2014 Rep Donna F Edwards (@repdonnaedwards) June 22, 2016\n\nPresident Obama and other top Democrats thanked Lewis for leading the effort.\n\n\u201cThank you John Lewis for leading on gun violence where we need it most,\u201d Obama said on Twitter. Lewis responded: \u201cThank you, Mr. President. I\u2019m just trying to help out and make a contribution.\u201d\n\nWarren called Lewis a \u201chero\u201d in a tweet.\n\nHero @repjohnlewis is leading a sit-in on gun violence & @SpeakerRyan shut off the camera so you can\u2019t watch. Shameful. #NoBillNoBreak \u2014 Elizabeth Warren (@SenWarren) June 22, 2016\n\nThe sit-in disrupted business elsewhere on Capitol Hill as well.\n\nRep. Eddie Bernice Johnson was the only Democrat at the House Science Committee hearing where Gina McCarthy was testifying.\n\n\u201c"} +{"output_text": " guy. Powell is a bit of a mystery, as he\u2019s been a bit of a mystery since he was drafted. He\u2019s a bit of a project, but he\u2019s got a lot of tools and has the potential to be a very good player.\n\nPowell is a bit of a project, but he\u2019s got a lot of tools and has the potential to be a very good player. He\u2019s a bit of a project, but he\u2019s got a", "input_text": " many Stanford hitters he doesn\u2019t often pull the ball with authority. He is a smart, selective hitter who draws walks, but as a 45-hit, 30-power guy, it\u2019s tough to imagine he\u2019ll give you enough offense to play every day. Since putting in contacts he\u2019s looked much more offensively-inclined, and should he hit at the higher levels in 2016, we reserve the right to change our opinion.\n\nBret Sayre's Fantasy Take: Those stat line scouters in your league will get plenty excited about Jackson, but even the skeptics shouldn\u2019t overlook him. Sure, he\u2019s not going to be the next Jose Reyes, but the speed and approach (even without much else) can lead to plenty of fantasy value. He\u2019s a ways away and has plenty of hurdles to clear, but he\u2019s played his way into the third round dynasty draft discussion this year.\n\nMajor league ETA: 2018\n\n8. Boog Powell, OF\n\nDOB: 07/30/1993\n\nHeight/Weight: 5\u201910\u201d 180 lbs.\n\nBats/Throws: L/L\n\nDrafted/Acquired/Bonus: Drafted in the 20th round of the 2012 MLB Draft by Oakland (Orange Coast College), signed for $1,000; traded to Seattle by Tampa Bay in six-player Brad Miller deal.\n\nPrevious Ranking(s): Unranked\n\n2015 Stats:.295/.385/.392, 3 HR, 18 SB in 522 PA at Double-A Montgomery and Triple-A Durham\n\nFuture Tools: 60 speed, 55 hit\n\nRole: 45\u2014Fringe regular/fourth outfielder\n\nNo relation. Now that we have that out of the way, we can talk about this"} +{"output_text": " Gase\u2019s tenure.\n\nNew England Patriots\n\nThe Patriots have a lot of talent at the quarterback position, but they are still looking for a franchise quarterback. Bridgewater could be the answer to their prayers.\n\nNew York Giants\n\nThe Giants have a lot of talent at the quarterback position, but they are still looking for a franchise quarterback. Bridgewater could be the answer to their prayers.\n\nNew Orleans Saints\n\nThe Saints have a lot of talent at the", "input_text": " yards, two touchdowns and an interception. This comes to 9.2 passing yards per attempt, which is a stat that should be utilized more often, in my opinion. His competition, Sam Darnold, has passed 21-29 for 158 yards and two touchdowns, good for 5.4 yards per attempt. In his first two games, Darnold has shown some bright flashes, but he continues to hold the ball too long and doesn\u2019t throw receivers open. Rather, he waits until he sees them open to let it go, which also goes along with holding it too long. This was a criticism I expressed of Darnold before the draft, and it\u2019s unfortunate that it hasn\u2019t been addressed yet.\n\nBy comparing the two Jets quarterbacks, I am not trying to say Bridgewater is good and Darnold is bad. Rather, my point is that one player has performed a bit better, and the other has received nearly all the media praise. The future for both players seems bright. And while Teddy is considered a veteran at this point, he is just four years older than the rookie. There is a long future ahead for Teddy Bridgewater, and he has just scratched the surface of his potential.\n\nWhile Teddy has certainly proved to be good enough to make an NFL roster this season, rostering three quarterbacks in week 1 only makes sense for the Jets if they continually look to shop him.\n\nPossible Landing Spots\n\nMiami Dolphins\n\nBridgewater could stay in the AFC East and join a team that doesn\u2019t have talent at the backup position, but does have questions with their starter. Coming off of a knee injury of his own, Tannehill is in a \u201cprove it\u201d year. If he and the Dolphins struggle, it could mean the end of both Tannehill and coach Adam"} +{"output_text": ", you could even have the Kuo-Toa as the primary antagonists in a campaign, and the Sahuagin as the heroes.\n\nThe Kuo-Toa are a race of aquatic humanoids that are native to the Kuo-Toa Sea. They are a race of aquatic humanoids that are native to the Kuo-Toa Sea. They are a race of aquatic humanoids that are native to the Kuo-Toa Sea. They are a race of", "input_text": " still patrolling the waters around the Mournland.\n\nLooking to the lightning rail, I\u2019m not sure whether you\u2019re asking if humans have created such a thing, or if it might already be in use by aquatic nations. Addressing the first point, I don\u2019t see such a thing happening any time soon\u2026 in part because the ocean floor is inhabited, and I don\u2019t see the Sahuagin being keen on Orien running a rail through their homeland. As the Sahuagin are an ancient and sophisticated culture, they should have their own answers to long-distance transportation and communication, but these could take many forms. They could have harnessed or bred special creatures to assist in transportation\u2026 or they may have come up with their own techniques for binding water elementals. As it\u2019s not something that was picked up in canon Eberron, it\u2019s not something I ever explored in great detail.\n\nAre there any long lost civilizations, perhaps currently unheard of in Khorvaire, whose remains are underwater? Apart from giants from Xen\u2019drik, that is.\n\nThere certainly could be. In the conversion notes for Lords of Madness I suggest that the aboleths were a civilization that existed during the Age of Demons, so you could easily have ancient aboleth ruins holding remnants of powerful magic\u2026 essentially, the undersea equivalent of Ashtakala and the Demon Wastes. Aside from that, this could be an interesting path to take with one of the other aquatic races, such as the Kuo-Toa. Perhaps the Kuo-Toa were once even more widespread and powerful than the Sahuagin, until SOMETHING devastated their civilization; now they are savages and subjects of the other races, and their ancient cities are haunted ruins. If you want to get really crazy"} +{"output_text": "unmaktad\u0131r.\n\nt7 \u2013 Proof of Stake / \u0130\u015f kan\u0131t\u0131\n\nt8 \u2013 \u0130ngilizce : binary\n\nt9 \u2013 Proof of Authority / \u0130\u015f kan\u0131t\u0131\n\nt10 \u2013 \u0130ngilizce : binary\n\nt11 \u2013 Proof of Work / \u0130\u015f kan\u0131t\u0131\n\nt12 \u2013 \u0130ngilizce : binary\n\nt13 \u2013 Proof", "input_text": " Szabo, \u201cSecure Property Titles with Owner Authority\u201d http://nakamotoinstitute.org/secure-property-titles/#selection-7.7-7.50\n\n[16] https://www.forbes.com/2008/09/23/naked-shorting-trades-oped-cx_pb_0923byrne.html#63076e102e6c\n\n[17] https://en.wikipedia.org/wiki/Delegative_democracy\n\n[18] E. Hughes https://www.activism.net/cypherpunk/manifesto.html\n\n[19] https://bitcoin.org/en/glossary/unspent-transaction-output\n\nTerc\u00fcme Notlar\u0131 ve De\u011ferlendirme\n\nt2 \u201ctoken\u201d kelimesinin bilim-teknik kullan\u0131m\u0131ndaki kar\u015f\u0131l\u0131\u011f\u0131 \u201c\u00f6zel i\u015faret/belirte\u00e7\u201d olup anlam b\u00fct\u00fcnl\u00fc\u011f\u00fc sa\u011flanmas\u0131 i\u00e7in bu \u015fekilde \u00e7evrilmi\u015ftir\n\nt3 \u2013 \u201chash\u201d kelimesinin bilgisayar terimi olarak kar\u015f\u0131l\u0131\u011f\u0131 \u201csa\u011flama\u201d\n\nt4 \u2013 Proof of Work / \u0130\u015f kan\u0131t\u0131\n\nt5 \u2013 \u0130ngilizce : binary\n\nt6 misli m\u00fclk(e\u015fya) yerine ayn\u0131s\u0131n\u0131n yenisi temin edilebilen m\u00fclk(e\u015fya) anlam\u0131na gelmektedir. Bu dijital i\u015faretlerin bir benzeri, temin edilebilecek bir yenisi bul"} +{"output_text": ", economically, and politically. We need you to speak out against racism and white supremacy.\n\nWe are writing to you today to ask you to join with many other political and religious leaders to proclaim with one voice that the \u201calt-right\u201d is racist, evil, and antithetical to a well-ordered, peaceful society.\n\nWe are writing to you today to ask you to join with many other political and religious leaders to proclaim with one voice that the \u201calt-", "input_text": " But, calls for Trump to take those sentiments further by \"joining with many other political and religious leaders to proclaim with one voice that the \u201calt-right\u201d is racist, evil, and antithetical to a well-ordered, peaceful society.\"\n\nThe publication of the letter comes days after controversial far-right, Steve Bannon-backed candidate Roy Moore defeated Trump's candidate in a runoff for the Republican Alabama Senate nomination. Moore has been criticized for controversial statements he has made in the past including that \"homosexuality is an inherent evil\" and refering to minorities as \"reds\" and \"yellows.\"\n\nTrump's informal Evangelical Advisory Board had largely stuck with Trump through recent controversies, although Broooklyn mega church pastor A.R. Bernard resigned following the violence in Charlottesville and Trump statements which many saw as not appropriately condemning white supremacy. Rev. Samuel Rodriguez, president of the National Hispanic Christian Leadership Conference, is the only member of the advisory board thus far to have signed the letter.\n\nRead the full letter below:\n\nOPEN LETTER TO PRESIDENT TRUMP\n\nFROM AMERICAN RELIGIOUS LEADERS:\n\nWE NEED YOU TO SPEAK\n\nDear President Trump:\n\nThe events that took place in Charlottesville, VA on August 12 grieved us. We were deeply troubled by the public display of racism on that day. It reminded us of a time years ago when such brazen displays of bigotry and white supremacy were common and were upheld by political leaders.\n\nWe love the United States of America. We have overcome much racial injustice, but we fear that without moral clarity and courageous leadership that consistently denounces all forms of racism, we may lose the ground that we have gained toward the racial unity for which so many of us have fought. Our nation remains divided racially"} +{"output_text": " Hall said.\n\n\u201cI was scared to death,\u201d Hall said. \u201cI was thinking, \u2018I\u2019m going to get shot.\u2019\u201d\n\nHall said he was so scared that he didn\u2019t even call 911. He said he was afraid Christian would kill him.\n\n\u201cI was just trying to get away from him,\u201d Hall said.\n\nChristian was eventually found hiding in a nearby apartment complex. He was arrested and charged with attempted murder.\n\nThe day after the", "input_text": " and Fred Meyer.\n\nSix days later, after police released a surveillance image and pleaded with the public to identify the man, they received a tip and found Tschaggeny at a homeless camp. He was wearing Best\u2019s ring on his pinky finger.\n\nTschaggeny had a raging heroin addiction. He apologized to one of Best\u2019s adult sons during a private meeting before he was sentenced to 13 months in prison and a drug treatment program.\n\nAs Tschaggeny stole Best\u2019s belongings, others tried to save Namkai-Meche, who had stumbled toward the open doors of the train before collapsing.\n\nNoonan, who had been with Best, eventually joined them after realizing he couldn\u2019t save Best.\n\nNoonan said he also quickly determined there was no way to stop Namkai-Meche\u2019s bleeding. The knife had penetrated his trachea and he was spitting up blood, he said.\n\nCHRISTIAN CHASED DOWN\n\nOut on the streets above the Northeast Portland transit center, two men on foot and a woman in a car pursued Christian as they kept a 911 dispatcher updated about his location. Christian had run across Interstate 84 and down a pedestrian path.\n\nAlvin Hall, a former Marine who was riding in a different car of the train that day, stepped onto the platform and saw strangers attending to the wounded.\n\n\u201cThe only thing I could think of was to ask, \u2018Who did this?\u2019 and \u201cWhere did they go?\u2019\u201d Hall testified.\n\nPeople on the platform said Christian had run up the stairs and across Interstate 84, so Hall followed for about a mile. At one point, he got within about 6 feet of Christian, who was behind a tree washing blood from his body with a container of soda,"} +{"output_text": " or Office Depot to get a background, so I had to make one. I used a piece of cardboard, a piece of cardboard, and a piece of cardboard. I cut out the castle, the background, and the trees. I then used a piece of cardboard to cut out the sky. I used a piece of cardboard to cut out the clouds. I used a piece of cardboard to cut out the moon. I used a piece of cardboard to cut out the stars. I used a piece", "input_text": " it came down to a very personal decision. There\u2019s not a bitter bone in his body; he\u2019s the most forward-looking person I know. They say his father was like that. And he\u2019s a happy person. It\u2019s impossible to know what\u2019s going to happen here in the primaries. But whatever the decision is, it's going to be a personal decision more than a political decision. \n\nEver since we\u2019ve had so much time on our hands, I\u2019ve re-arranged my basement to beef up my studio. I\u2019m starting getting into the swing of things with my vintage Masters of the Universe toys. I have tons of pics to share, and some photo tips along the way.\n\n\n\n\n\nI recently restored a vintage Castle Grayskull (you can read about the process, including some 3D printed pieces) and I thought the best way to get started with my studio was recreating the end of the original Castle Grayskull television ad from 1982. That end frame that laid out all of the accessories and adventures to behold captured my five year old brain and was the first time an ad got me thinking \u201cyou really want this thing\u201d. So after I got every last part of the weapons rack (which wasn\u2019t easy) I set up the castle and pieces as the best I could. The video camera lenses in the 80s are built differently than the 24-70mm lens on my DSLR, so it was tough getting the perspective exactly right. The lighting wasn\u2019t so complicated, I noticed there\u2019s a lack of hard shadows, so I had two remote flashes aimed right up at the ceiling to bounce the light on either side of the castle. You can see where they were situated in the pic below.\n\n\n\nThe toughest part was the background. I can\u2019t really pop out to Staples"} +{"output_text": "uttle is the orbiter. It\u2019s the large, white, cylindrical object that the Shuttle was attached to during launch and landing.\n\nSpace Shuttle. The Space Shuttle was the name given to the Space Shuttle program. It was a joint program between NASA and the U.S. Air Force. The Space Shuttle was the largest and most complex vehicle ever built by NASA. It was designed to carry astronauts into space and back.\n\nSpace Shuttle. The", "input_text": "s a super strong composite material that made up the leading edge of the Shuttle\u2019s wings. When NASA saw that a piece of foam had hit the left wing during launch, the engineers were more concerned about any potential damage done to the wing\u2019s tiles. They were less concerned about the RCC, because they thought it was strong enough to handle a blow. NASA later found that the foam had indeed punched a hole in the RCC, which ultimately led to the accident.\n\nExternal tank. This was the large orange tank attached to the bell of the Space Shuttle during launch. It held the liquid hydrogen and liquid oxygen propellant needed for takeoff. The external tank was insulated with foam to prevent it from overheating. It\u2019s this foam that broke off and hit the left wing of the Shuttle.\n\nSRB: Solid rocket booster. When the Space Shuttle launched, it had the help of two white solid rocket boosters. The SRBs were attached to either side of the external tank and provided extra thrust needed to get the Shuttle into orbit. Two flights before Columbia\u2019s last mission, a piece of foam broke off of the external tank and dented the bottom of one of the SRBs \u2014 similar to what happened on STS-107.\n\nMER: Mission evaluation room. This is the room my parents were standing in when the accident occurred. It\u2019s where the engineers who were experts in certain hardware would stay during launch and landing to provide any help to those in Mission Control. Specifically, Mom and Dad were monitoring the wing\u2019s hydraulics, plumbing throughout the Space Shuttle that helps to control certain systems. Just before the accident, hydraulics sensors in the left side of the vehicle were starting to fail, which told my parents right away that something was about to go wrong.\n\nOrbiter. Another name for the Space Sh"} +{"output_text": " relative.\n\nAD\n\nWAPO: What are the chances that the best person for the job just happens to be a relative?\n\nAD\n\nCLARK: The chances are pretty high.\n\nWAPO: What are the chances that the best person for the job just happens to be a relative?\n\nCLARK: The chances are pretty high.\n\nWAPO: What are the chances that the best person for the job just happens to be a", "input_text": ": Fewer and few folks want to view themselves as Democrats and the GOP has never been popular (even though far more people consider themselves \"conservative\" than \"liberal\"). And note what Gallup are Harris are talking about there is not party registration. It's identification and self-affiliation; how you see yourself. It's a cultural identity.\n\nThe easy reading of this is pretty obvious and rooted in our national DNA: Americans want refuge from politics, not an expansion of it to cover every aspect of our lives, and that's something increasingly bitter dead-enders don't want to acknowledge.\n\nReason on the Giffords shooting. I spoke with Clark about anti-nepotism laws, why they exist, and how Kushner and Trump might get around this particular one. Our conversation is below, lightly edited for clarity and brevity.\n\nAD\n\nWAPO: I think a casual observer may wonder why Trump\u2019s son-in-law serving in his administration is a big deal. Why do such anti-nepotism laws exist, and why is nepotism a problem?\n\nAD\n\nCLARK: We have anti-nepotism laws in the federal government and in lots of state governments, because the practice of hiring relatives undermines public confidence that the government official is actually finding best person for the job. What are the chances that the best person for the job just happens to be a relative, right? In addition to the problem of public confidence, hiring a relative also causes problems within the government organization. It can undermine the morale of government officials. It can cause confusion about what the lines of authority are; in other words, the relative may have a particular title, but many may perceive the relative\u2019s role as even more important than the title would suggest. It may be very difficult to say no to the"} +{"output_text": "es. O PMDB n\u00e3o \u00e9 um partido de esquerda. O PMDB \u00e9 um partido de direita. Eles n\u00e3o querem que o PMDB seja um partido de esquerda. Eles querem que o PMDB seja um partido de direita. Eles querem que o PMDB seja um partido de direita, mas n\u00e3o querem que o PMDB seja um partido de esquerda.\n\nEles querem que", "input_text": " a cria\u00e7\u00e3o do PL, visto pelo PMDB como uma tentativa de diluir seu poder), que n\u00e3o podia dar certo. Porque n\u00e3o ia achar uma turma de pessoas experientes da pol\u00edtica que assistisse isso acontecer. Eu ouvi uma frase \u00f3tima outro dia: \u201cD\u00e1 60 deputados para o Kassab para ver se ele n\u00e3o ia ficar igual\u201d. Kassab virou formador de partido.\n\nFormador de partido para barganhar com o governo?\n\nEu n\u00e3o quero acusar o Kassab. Foi a op\u00e7\u00e3o que ele fez e, se deram corda para ele, quem deu a corda \u00e9 que est\u00e1 errado. Cada um prop\u00f5e o que quiser. Quem aceita, ou n\u00e3o, \u00e9 o outro.\n\nIsso influenciou a crise pol\u00edtica?\n\nNo dia da elei\u00e7\u00e3o, ela (Dilma) foi erraticamente discutir reforma pol\u00edtica com plebiscito, e optou por esse caminho de formar um partido falso, inexequ\u00edvel. E isso jamais ia ter sustenta\u00e7\u00e3o pol\u00edtica. Ele (Kassab) foi ajudado por todo mundo para fazer aquele partido.\n\nFoi uma tentativa de a presidente ficar mais independente de Lula e do PT?\n\nEu acho que n\u00e3o foi contra o Lula. Foi contra a gente mesmo, contra o PMDB.\n\nMas o PMDB j\u00e1 estava independente na C\u00e2mara antes de come\u00e7ar a forma\u00e7\u00e3o desse novo partido...\n\nMas a independ\u00eancia do PMDB na C\u00e2mara foi por outro movimento errado del"} +{"output_text": "-loving, football-loving, beer-drinking, pub-loving, football-loving, beer-drinking, pub-loving, football-loving, beer-drinking, pub-loving, football-loving, beer-drinking, pub-loving, football-loving, beer-drinking, pub-loving, football-loving, beer-drinking, pub-loving, football-loving, beer", "input_text": "/or positions of American News Report, Microcast Media Group or any of its employees, directors, owners, contractors or affiliate organizations. American News Report makes no representations as to the accuracy, completeness, currentness, suitability, or validity of any information in this column, and is not responsible or liable for any errors, omissions, or delays (intentional or not) in this information; or any losses, injuries, and or damages arising from its display, publication, dissemination, interpretation or use.\n\nOpposing views, opinions and positions about this column are welcomed by American News Report and or Microcast Media Group. Publication or lack of publication of opposing views, opinions and/or positions does not imply, suggest or expressly reflect an endorsement or disapproval of the originating commentary on the part of American News Report or Microcast Media Group. It feels like just yesterday, but it\u2019s been almost exactly a year since I started this blog. My first post was written in early September, 2014. The world was a different place. Donald Trump was still just a run-of-the-mill, angry, idiot billionaire, not a presidential front-runner. A young Ed Miliband was captivating all of Britain with his passionate and stirring speeches. People everywhere were dumping buckets of ice water over their heads, to challenge themselves to do\u2026 something. And this little blog, created in the hope of entertaining two mighty nations through humour and satire was born. My first ever post: some poorly written tosh about how Canadian expats are constantly reminding you that they are Canadian, and not in fact Americans. (As if there\u2019s any real difference).\n\nSince then, more than 50 other magical postings have followed. The most-popular blogs have been about British pubs and Thanksgiving celebrations, and the least-popular, about American wrestlers and an expose on Britain\u2019s England"} +{"output_text": ".\n\n\u201cI\u2019m here to support Governor Henry McMaster,\u201d Trump said. \u201cHe\u2019s a great guy. He\u2019s a great governor. He\u2019s a great guy. He\u2019s a great leader. He\u2019s a great friend of mine. He\u2019s a great friend of mine. He\u2019s a great friend of mine. He\u2019s a great friend of mine. He\u2019s a great friend of mine. He\u2019s a great friend of mine", "input_text": " questions Romney\u2019s conservative credentials and ability to work well with the president. Kennedy won over far-right conservatives at the state GOP convention earlier in the year, but he\u2019s expected to struggle among more moderate Republican voters \u2014 including many Trump critics \u2014 around the state.\n\nIn a weekend op-ed published in The Salt Lake Tribune, Romney wrote that the Trump administration\u2019s policies have exceeded his expectations in its first year, but he pledged to \u201ccontinue to speak out when the president says or does something which is divisive, racist, sexist, anti-immigrant, dishonest or destructive to democratic institutions.\u201d\n\nNot to be forgotten Tuesday: races to determine gubernatorial candidates in Maryland, Colorado and Oklahoma. Oklahoma is also deciding whether to legalize the cultivation, possession and use of marijuana for medicinal purposes.\n\nBut the most significant test of Trump\u2019s influence comes in South Carolina, where McMaster \u2014 elevated to the state\u2019s top office last year when Nikki Haley became U.N. ambassador \u2014 is in jeopardy. Two weeks ago, the sitting governor failed to win the GOP primary outright, requiring a runoff election this week with Warren.\n\nWarren, a millionaire businessman and a Marine, has argued that his outsider candidacy makes him, not longtime GOP establishment figure McMaster, more akin to Trump. McMaster shocked even his closest advisers when, as lieutenant governor in early 2016, he became the first statewide-elected official in the country to back Trump\u2019s White House bid.\n\nThe White House has been throwing everything at its disposal into the race to save McMaster. Trump visited the state for a fundraiser last year. Vice President Mike Pence appeared at a campaign rally with McMaster over the weekend.\n\nTrump dedicated only a few minutes of his hourlong rambling speech to the Republican governor he was there to support"} +{"output_text": " make the game as good as we can. We are trying to make the game as good as we can and then we'll worry about the expansions.We are not going to do anything that is going to make the game worse. We are not going to do anything that is going to make the game less fun. We are not going to do anything that is going to make the game less balanced. We are not going to do anything that is going to make the game less interesting. We are not", "input_text": " what video games could become, and we want to take this opportunity with Starcraft 2 to push E-Sports forward. But, we really hope that lots of people out there see how fun it is and jump on board to make their own tournaments for our games or whoever's games. As long as we have a lot of E-Sports out there, we'll have more E-Sports players, more sites and bigger sites. It's just good for the business overall and because we put so much energy into it, it's good for us.So we are really hoping that E-Sports, in the States specifically and in Europe as well takes off like it has in Korea where it's huge. We see that opportunity like \u201cwow this is so fun, there's no reason this couldn't be a world-wide phenomenon\u201d instead of one limited to a few places in the world.We haven't gotten to that point yet where we are designing stuff exclusively for expansions, but we haven't been throwing anything out. If we have something that doesn't fit right now but we think it's cool, we'll ear mark that as \u201cpossible expansion\u201d. Like at this point we're going to be done with the original Starcraft 2 here, Wings of Liberty when they rip it out of our cold dead hands. We are going to put every ounce of energy into this release and then we'll worry about the expansions.So if we have an awesome idea that will just not squeeze in, sure, we'll earmark that for expansion. But we are always watching to say \u201chey does that suddenly fit?\u201d and if it fits, we'll put it in the ship.No we aren't leaving any intentional holes, is that what you are asking?No we aren't doing anything where we think \u201cthere's a hole here but we can fill it with the expansion\u201d. We are trying to"} +{"output_text": " father was governor--will be asked about the auto bailout.\n\nBut the debate will also be a chance for the candidates to show their chops on foreign policy, which is a major issue in the race.\n\n\"I think the debate will be a good opportunity for the candidates to show their foreign policy chops,\" said Republican strategist Ford O'Connell. \"The debate will be a chance for them to show their foreign policy chops, and I think that's a good", "input_text": " art community as the auto industry slowed. But those initiatives never really gained traction, as smaller retail enterprises streamed out of the city and into shopping malls. The city also faced chronic budget and administrative challenges, which eventually prompted the state to appoint an emergency financial manager for the city in 2009. And in a move that seems to dramatize Pontiac's growing economic isolation, a new four-lane highway now surrounds its downtown, effectively cutting off the city's commercial center from surrounding communities.\n\nStory continues\n\n\"The demise of the city occurred when they closed [highway] M-59 and Amtrak... it crossed off our traffic,\" said Maureen Young, co-owner of local Bo's Smokehouse and secretary of Pontiac's downtown business association.\n\nMany boarded-up homes crop up across Pontiac, as they do throughout Michigan, which has been hit exceptionally hard by the foreclosure crisis. Last week, the city of Highland Park, announced that it would completely remove 1,000 streetlights, since it can no foot its power bills, with the loss of 50 percent of its population over the past two decades. Meanwhile, Detroit's population decreased by 25 percent in the time between the 2000 and 2010 census.\n\nStill supporters of the 2009 auto-industry bailout say things could have been much worse.\n\n\"Look at where the auto industry was when [George W.] Bush was leaving, and President Obama was coming in to where it is now,\" former Democratic Rep. Mark Schauer said Tuesday. \"It is a night and day difference that occurred because of intentional policies.\"\n\nMeasures such as the auto bailout will no doubt draw much discussion at Wednesday's debate. The debate will focus principally on economic issues, meaning that all eight Republican candidates--including former Massachusetts Gov. Mitt Romney, who was born and raised in Michigan where his"} +{"output_text": " with Stagecoach, is due to expire in 2020. The government has said it will not renew the contract.\n\nWhy is the government not renewing the contract?\n\nThe government has said it will not renew the contract because it is not satisfied that the West Coast line is being run in the public interest. The government has said it will not renew the contract because it is not satisfied that the West Coast line is being run in the public interest.\n\nWhat is the West Coast", "input_text": "Analysis by the Guardian indicates that Virgin Rail Group Holdings, the joint venture company, will have collected at least \u00a3600m since its launch in 1997, a figure that drew criticism from Labour.\n\nThe final total is likely to be higher once this year\u2019s dividend is declared when the company\u2019s next set of annual accounts is published in October next year.\n\nBranson\u2019s Virgin Group owns 51% of the venture, giving him a \u00a3306m share of the overall dividend pot.\n\nThe remaining \u00a3294m was allocated to the Stagecoach transport group, whose largest shareholder is the Scottish businessman and Scottish National party donor Brian Souter, together with his sister, Ann Gloag.\n\nThe highest dividend in a single year was paid in 2009, when Virgin Rail Group paid out nearly \u00a395m. The figure has hovered around \u00a350m over the past three years.\n\nAndy McDonald, the shadow transport secretary, said: \u201cThis money could and should have been used to invest in services and hold fares down, not siphoned off by shareholders.\n\n\u201cThe railway should be run as a public service in public ownership. Instead, absurdly, its run in the financial interest of foreign state-owned companies and billionaires such as Richard Branson. If Virgin disappears from the railway as Branson warns, it won\u2019t be missed by taxpayers or passengers.\u201d\n\nA spokesman for Virgin Trains said: \u201cIn the last five years alone, Virgin Trains has paid almost a billion pounds (\u00a3970m) back to taxpayers through creating a highly successful business which taxpayers benefit from.\u201d\n\nQuick guide Why Virgin Trains won't be running Show Hide When is Virgin disappearing from UK railways?\n\nBy next spring at the latest, and possibly even earlier. The contract Virgin has for the West Coast line, which is in partnership"} +{"output_text": "yl, on the other hand, is a woman who has been through a lot. She has been through a lot of trauma, and she has been through a lot of trauma in the past year. She has been through a lot of trauma in the past year, and she has been through a lot of trauma in the past year. She has been through a lot of trauma in the past year, and she has been through a lot of trauma in the past year. She has been through a lot", "input_text": " women like this are gone within five episodes, and none that come after come anywhere near them in terms of breaking with the usual tropes and stereotypes.\n\nAt this point, it is an excellent time to bring up Meryl and Milly, our insurance \u201cgirls\u201d. Just the language used to refer to them is telling enough \u2013 they are \u201cgirls\u201d, not women. They are continuously in over their heads, trying to use doughnuts to cajole criminals they believe to be Vash, forever playing catch up. Ultimately, they are the running gag of the show; what would really change were they removed from the narrative? Less jokes about people disbelieving Vash\u2019s identity, less of an awareness that Vash really loves doughnuts\u2026 Not much else.\n\nPut another way: do you notice when Meryl and Milly aren\u2019t around? Do you care? Does anyone walk away from the show wishing we\u2019d spent more time with their antics?\n\nNo \u2013 because the two do not matter in the grand scheme of the show. Why is there a forced romance angle from Meryl toward Vash or Milly toward Wolfwood? What on earth does this add to the show? Maybe if it were even executed at all one could make an argument for the inclusion of both. But it feels more obligatory to have any hint of romance at all rather than organic, as if someone looked at the cast and said, well, there are two guys and two women, so obviously\u2026\n\nI take especial issue with the entire Milly and Wolfwood thing. I quite frankly find the idea that Milly does sleep with Wolfwood gross. Milly for the entirety of the show has been played as an innocent, as an overgrown child. She loves pudding and speaks and thinks in simplistic ways. She does not know her limits on alcohol. Mer"} +{"output_text": " the CIA\u2019s use of malware to hack into the computers of foreign governments and private companies, including the German government, the French oil company Total, the Russian energy company Gazprom, the Chinese telecommunications company Huawei, the Brazilian oil company Petrobras, the Mexican oil company Pemex, the Saudi Arabian oil company Aramco, the South Korean steel company POSCO, the Japanese electronics company Sony, the South Korean conglomerate Samsung, the Chinese tech company Lenovo, the Chinese", "input_text": "te trial, Australian Department of Foreign Affairs and Trade officials confirmed that it was possible that Assange would face additional counts carrying the death penalty if he was dispatched to the US. The timing of their statements, which contradict the previous claims of US allies, could indicate that there is much at stake for Assange in the attempted US prosecution of Schulte.\n\nThe failure of the jury to convict, after a \u201cnational security\u201d trial in which all advantages were slated to the prosecutors, underscores the criminal character of the media blackout of the proceedings, which began in late January. For over a month, the most prominent corporate media outlets have remained silent on court hearings which revealed aspects of the politically-motivated witch-hunt of WikiLeaks and its alleged sources.\n\nThe publication of Vault 7 in early 2017 was the trigger for a major escalation in the US government vendetta against Assange, culminating in his illegal expulsion from Ecuador\u2019s London embassy last year, his arrest by the British police and imprisonment in a maximum-security prison. Schulte\u2019s trial, moreover, coincided with the first week of the British extradition hearing against Assange, which underscored the similarities in the lawless treatment of the WikiLeaks publisher and his alleged CIA source.\n\nProsecutors have described the Vault 7 leak, which they accuse Schulte of being responsible for, as the largest in the entire history of the CIA. The disclosure has been compared to the releases of National Security Agency whistleblower Edward Snowden, who in 2013 exposed mass US government surveillance of the American and world population.\n\nVault 7 revealed that the CIA was conducting illegal spying operations, including through phones and household appliances such as smart televisions. It exposed the US government as one of the biggest purveyors of malicious computer viruses in the world.\n\nVault 7 documented"} +{"output_text": "00000000a845e57fcc55024711da2652b6956e9f72a252fe.\n\n\n\nThe optional: a is a way to pass in a value. It is a string that is the hex representation of a byte32. So if you want to pass in a string you would do:\n\n\n\nA.call(the_contract_signature_signature, \u201cstring\u201d, [optional: a, set, of, parameters])\n\n\n\nThe optional:", "input_text": " can call any other contract that may come online in the future.\n\n\n\nIt turns out there is a way to call code and contracts that you don\u2019t really know about from a contract. This is accomplished by using the CALL feature of solidity. The best documentation I could find on it is here under the members section, but that seemed to be a bit incomplete.\n\n\n\nIf you have an address A in your solidity contract and A is a contract, you can call one of its functions by doing:\n\n\n\nA.call(the_contract_signature_signature, [optional: a, set, of, parameters])\n\n\n\nThe stuff that isn\u2019t documented very well that found were the following:\n\n\n\nThe param the_contract_signature_signature looks like \u201cbytes4(sha3(\u201cMyFunction(address,uint256)\u201d))\u201d where the param items are pretty specific. I didn\u2019t explore the whole scope of value types, but I did find out that uint doesn\u2019t work\u2026you need uintXXX where XXX is the byte length. Also, if your function has no arguments it needs to be \u201cMyFunction()\u201d. What you end up with here is a 4 byte signature that points to the function in the contract. Apparently, to save space the full function name isn\u2019t used and they use the sha3 to guarantee uniqueness. Only taking the first 4 bytes probably could lead to some collisions, but maybe not. There is probably a function that calculates this. Have fun chasing down that bug if it ever happens.\n\n\n\nThe [a, set, of, parameters] is just a list of params where each one is the hex representation of byte32 padded on the left. So an address of 0xa845e57fcc55024711da2652b6956e9f72a252fe becomes 0x0000000000000000"} +{"output_text": "ivals in Austin, Texas, will feature a Smash Bros. tournament. The event is being organized by the Austin-based Smash Bros. Melee community, and will feature a $10,000 prize pool.\n\nThe tournament will be held at the Austin Convention Center, and will feature a $10,000 prize pool.\n\nThe tournament will feature a $10,000 prize pool.\n\nThe tournament will feature a $10,000 prize pool.\n\nThe tournament", "input_text": " these three persons the One God is shown, Each first in place, each last, not one alone. Of Brahma, Vishnu, Siva, each may be First, second, third among the Blessed Three\u2019\u2019.\n\nOn ancient Egypt, Newton quotes Professor Sayce (Gifford Lectures & Hibbert Lectures):\n\n\u2018The indebtedness of Christian theological theory to ancient Egyptian dogma is nowhere more striking than in the doctrine of the Trinity. The very same terms used of it by Christian theologians meet us again in the inscriptions and papyri of Egypt\u2019.\n\nNewton continues:\n\n\u2018And now we see some meaning in the strange phrases that have puzzled so many generations in the Nicene and Athanasian Creeds, such as \u2018Light of Light, Very God of Very God, Begotten not Made, Being of one Substance with the Father.\u2019 These are all understandable enough if translated into the language of the Solar Trinity [worshipped in ancient Egypt], but without this clue to their meaning, they become sheer nonsense or contradictions\u2026The simplicity and symmetry of the old sun Trinities were utterly lost in forming these new Christian Creeds on the old Pagan models\u2026The [pagan] trinities had all the prestige of a vast antiquity and universal adoption, and could not be ignored. The Gentile converts therefore eagerly accepted the Trinity compromise, and the Church baptized it. Now at length we know its origin\u2019.\n\nYes, parts of the Nicene and Athanasian Creeds were plagiarized from pagan religious texts \u2013 word for word, phrase for phrase!\n\nRead Next: Why is this Important? SXSW is running a high-profile Smash invitational \u00a9 SXSW\n\nThis weekend, \u201cBattle of the Five Gods\u201d at the South by Southwest Conference and Fest"} +{"output_text": " Reviewed / Jackson Ruckar)\n\nIf you're looking for a gift that will always be remembered, consider a subscription to a magazine. It's a great gift for anyone who loves to read, and it's a great way to get them to read something you think they'll enjoy.\n\nGet a Subscription to the New Yorker on Amazon for $12.99\n\n15. A new pair of headphones\n\nA new pair of headphones is a great gift for anyone who", "input_text": " there, the $5 Wet 'n Wild beat out the competition by a large margin (yes, even outperforming Kylie's ever-popular lip kits). It stayed on the longest, was easy to wash off, and didn\u2019t leave marks anywhere. Plus, at this price, you might as well get them several colors or buy one for each of your friends.\n\nGet Wet 'n Wild Liquid Lipstick on Amazon for $3.74\n\n12. An eye mask for anyone stressed this holiday season\n\nSweet dreams at any time of day. (Photo: Reviewed / Jackson Ruckar)\n\nAfter the holidays, we're all going to need some extra shut-eye. To make it easier to catch up on a few zzz's, gift your particularly stressed friend, coworker, or parents the best eye mask money can buy. The Nidra Deep Rest eye mask has an adjustable velcro strap that keeps the mask secure, is super comfortable, and will leave you well rested\u2014even if you're sleeping in the middle of the day.\n\nGet the Nidra Deep Rest Eye Mask on Amazon for $11.95\n\n13. Socks that everyone needs but never buys\n\nEveryone needs a new pair of socks. (Photo: Loritta)\n\nSocks make the best gifts ever. Case in point: How often do you need socks? Always. How often do you buy socks? Probably not enough. Anyone would be happy to receive fuzzy socks, wool socks, heck even gym socks as a gift. You know they'll definitely use them and they'll think of you every time they put them on.\n\nGet 5 Pairs of Wool Socks on Amazon for $18.99\n\n14. Their favorite magazine subscription\n\nYour recipient will always think of you when they open their mail. (Photo:"} +{"output_text": "tica\n\nO juiz S\u00e9rgio Moro, que comanda a Lava Jato, \u00e9 um dos principais respons\u00e1veis pela investiga\u00e7\u00e3o que levou \u00e0 pris\u00e3o do ex-presidente Luiz In\u00e1cio Lula da Silva.\n\nMoro \u00e9 um juiz de um caso s\u00f3, al\u00e9m de extremamente capaz, muito disciplinado e eficiente. \u00c9 por isso que ele tem sido t\u00e3o c\u00e9lere.\n\n", "input_text": " \u00e0 Petrobras. Retirou dele, por exemplo, o caso da usina de Belo Monte, embora este tenha sido aberto a partir de delatores da Lava Jato. Boa parte das linhas de investiga\u00e7\u00e3o partindo da Petrobras j\u00e1 foram abertas, 179 suspeitos foram acusados, com 93 condena\u00e7\u00f5es.\n\nAssim, Moro espera ter um n\u00famero significativamente menor de novas opera\u00e7\u00f5es sob seu comando a partir do pr\u00f3ximo ano, segundo tem dito em conversas privadas. O juiz n\u00e3o quis dar entrevista.\n\nIsso n\u00e3o significa que a Lava Jato vai se encerrar, mas sim que pode perder substancialmente intensidade e velocidade.\n\n\u201cMoro \u00e9 um juiz de um caso s\u00f3, al\u00e9m de extremamente capaz, muito disciplinado e eficiente\u201d, disse Floriano Azevedo Marques, professor de Direito na Universidade de S\u00e3o Paulo. \u201c\u00c9 por isso que ele tem sido t\u00e3o c\u00e9lere.\u201d\n\nA Pol\u00edcia Federal e o Minist\u00e9rio P\u00fablico dizem que a Lava Jato vai continuar independente de poss\u00edveis mudan\u00e7as governo. Nos c\u00edrculos jur\u00eddicos, no entanto, h\u00e1 uma sensa\u00e7\u00e3o de que, quando Moro concluir sua investiga\u00e7\u00e3o sobre a Petrobras \u2013 epicentro da Lava Jato \u2013 haver\u00e1 desacelera\u00e7\u00e3o. O ritmo do STF, que lida com casos desde impeachment a maioridade penal, \u00e9, por ess\u00eancia, mais lento.\n\nMarco Zero\n\nRecomendado para voc\u00ea Frequ\u00eancia Pol\u00ed"} +{"output_text": " the Giants' midfield depth and depth in the forward line a concern.\n\nThe Giants' midfield depth is a concern, with the club's midfield depth a concern.\n\nThe Giants' midfield depth is a concern, with the club's midfield depth a concern.\n\nThe Giants' midfield depth is a concern, with the club's midfield depth a concern.\n\nThe Giants' midfield depth is a concern, with the club's midfield depth a concern.\n\nThe Giants' midfield", "input_text": " rack the ball up for Gold Coast, with a substantial production dip likely for Ellis.\n\nFollowing the retirement of Tom Nicholls and Brayden Crossley's positive drug test, Smith is an addition purely to add depth behind Jarrod Witts.\n\nThe return Gold Coast received for Ah Chee, headlined by a 2020 second round pick, is a positive return given he has struggled to play a consistent brand of football in his four years with the Suns. Ah Chee is yet another former top-10 selection Gold Coast have been unsuccessful in developing.\n\nGreater Western Sydney\n\nIn: Sam Jacobs, pick No.6, pick No.40 (Tomlinson compensation), pick No.59, 2020 round three pick (North Melbourne)\n\nOut: Aiden Bonar, Jonathon Patton, Adam Tomlinson, pick No.12, pick No.18, 2020 round four selection\n\n2019 Draft Picks: 6, 40, 59, 60, 80, 94\n\nGrade: C\n\nRationale: The versatility of Tomlinson will be missed by the Giants, who were not adequately compensated given his effectiveness in his 73 games played over the past three seasons. Losing Patton for such a poor return is disappointing given how dominant Patton was in 2016 and 2017. While Aiden Bonar is not a required player long term as he does not project to be a best 22 player due to the sheer strength and depth of GWS' midfield, the return was disappointing for 2017's pick 11.\n\nGWS' trade for Sam Jacobs filled the club's most pressing need and was a must following the retirement of Dawson Simpson and query over Shane Mumford's future, with the Giants in win-now mode.\n\nGWS' trade up to pick 6 from picks 12 and 18 is a wise one, with"} +{"output_text": ".40 7 16.30 8 80 9 80 9 16.00 9 81 9 80 8 16.20 8 16.10 8 32.70 8 80 9 79 9 15.90 9 79 9 79 9 15.90 9 81 8 80 8 16.20 8 16.10 8 24.55 8 80.900 8 0 0.00 80.900 8 The Crusaders 80 10 80 10 16.00 10 82 8 81 8 16.20 8 16.10 10 79 11", "input_text": " 17.10 5 17.25 4 86 1 85 4 17.10 4 90 2 87 3 17.70 2 17.40 3 34.65 3 86 5 82 5 16.80 5 86 3 87 2 17.30 2 86 3 84 4 17.00 3 25.55 4 85 4 84 4 16.90 4 85 4 85 4 17.00 4 82 5 85 2 16.70 3 86 5 86 3 17.20 4 16.95 4 25.425 4 85.625 4 0 0.00 85.625 4 Santa Clara Vanguard 86 4 85 5 17.10 5 87 4 86 4 17.30 4 17.20 5 83 5 85 4 16.80 5 87 4 83 6 17.00 5 16.90 5 34.10 5 87 4 86 3 17.30 3 84 5 83 5 16.70 5 85 4 83 5 16.80 5 25.40 5 84 5 82 5 16.60 5 84 5 84 5 16.80 5 85 1 86 1 17.10 1 88 3 86 3 17.40 3 17.25 2 25.325 5 84.825 5 0 0.00 84.825 5 Blue Knights 84 6 83 6 16.70 6 85 6 84 6 16.90 6 16.80 6 82 7 82 7 16.40 8 84 7 84 5 16.80 6 16.60 6 33.40 6 82 7 81 6 16.30 6 79 9 78 9 15.70 10 80 9 79 8 15.90 8 23.95 8 82 7 80 7 16.20 7 83 6 82 6 16.50 6 81 6 80 6 16.10 6 84 7 83 6 16.70 7 16.40 6 24.55 6 81.900 6 0 0.00 81.900 6 The Cavaliers 81 8 81 8 16.20 8 85 6 83 7 16"} +{"output_text": ".\" g.author \"Federico Ramirez\" g.url \"http://blog.beezwax.net\" end generator.generate! 1 2 3 4 5 6 7 8 9 generator = EPUBGenerator do | g | g . title \"My Awesome Book\" g . description \"An awesome book, really.\" g . author \"Federico Ramirez\" g . url \"http://blog.beezwax.net\" end generator . generate !\n\nNow we", "input_text": "new(title: \"My Awesome Book\", description: \"An awesome book, really.\", author: \"Federico Ramirez\", url: \"http://blog.beezwax.net\") 1 2 generator = EPUBGenerator. new ( title : \"My Awesome Book\", description : \"An awesome book, really.\", author : \"Federico Ramirez\", url : \"http://blog.beezwax.net\" )\n\nYou might say \u201cMeh it\u2019s not that bad\u201d. And you would be right! But we are taking an unnecessary risk, four arguments for a method is a red flag \u2014 it can get out of hand quite easily.\n\nThere are many ways to solve that issue, the most common of which is to \u201cextract it into an object\u201d. Let\u2019s create a Book model. We just add the arguments as attributes, make sure the data is always consistent and just inject that object into our generator. Now our code is not only more solid and easier to maintain, but we have the added benefit of testability.\n\nNow we are done\u2026 well, not really. Consider now that our EPUB generation library is a Ruby gem. We\u2019ll force all our users to know all the class names: EPUBGenerator, Chapter and Book.\n\nIf the library is this small, it\u2019s not really a big deal. If we know we\u2019ll need to expose the user to more classes, then we might want to consider a better solution. This is where a DSL comes handy.\n\nA DSL gives us yet another layer of abstraction. In this example, with a single class name, the user can easily use the library to create a new EPUB:\n\ngenerator = EPUBGenerator do |g| g.title \"My Awesome Book\" g.description \"An awesome book, really"} +{"output_text": " a game, and I was determined to find it this time. I\u2019d been playing a lot of games lately, and I was getting a little burned out. I\u2019d been playing a lot of games lately, and I was getting a little burned out. I\u2019d been playing a lot of games lately, and I was getting a little burned out. I\u2019d been playing a lot of games lately, and I was getting a little burned out. I\u2019d been playing a lot", "input_text": " sea) that no sovereign sought to control. The spread of Buddhism from South Asia along the \u2018belt\u2019 and \u2018road\u2019 wove a common world of religious-cultural ambiance and sensibility that signified both integration and cosmopolitanism.\n\nStanding in the UNESCO offices in March 2014, Chinese President Xi Jinping extolled the profound impact of Buddhism on China. For his part, Modi, a firm devotee of the Buddha, commenced his recent China tour at a shrine built to commemorate a famous Chinese Buddhist monk who had visited his ancestral village in Gujarat during the Tang Dynasty era. A 21st century infrastructure project geared to connect the Asian heartland to its hinterland and beyond might yet revive a set of loose integrative norms, which can foster principles of order and self-restraint in East Asia and South Asia.\n\nModi, unrestricted by the blinkers of his elitist predecessors, should exercise his abundant leadership qualities to walk India and South Asia confidently down this path.\n\nSourabh Gupta is Senior Research Associate at Samuels International Associates, Inc., Washington, DC.\n\nThis article appeared in the most recent edition of the East Asia Forum Quarterly, \u2018Leadership in the region\u2018. One of the things that I believe is utterly vital to writing about games in general and MMOs in particular is finding the fun. It\u2019s sometimes difficult, but I think finding the fun is the difference between saying that a game is hot garbage (which it may be) and saying that it\u2019s not to your tastes. That\u2019s not to say the fun is even always there to be found, but if you can understand why someone might enjoy the game, you can at least work from common grounding.\n\nIt was something I hadn\u2019t managed the last time I played"} +{"output_text": " we can begin to heal.\n\nThe Mind will automatically start to create healthy and positive narratives to frame our traumatic experiences with. The Eye of Gratitude opens and we will start to perceive the wisdom hidden in our traumatic experiences. And as we start to feel gratitude for those particular lessons, our capacity for gratitude will also expand. We now have the tools to be able to confront our traumas, and so, we can begin to heal.\n\nThe Mind will automatically start to create healthy", "input_text": ". And when you dwell within that state, you will feel a sense of completion, a sense of home. A path opens up. The key word here is path, for it is not the end of the journey.\n\nThis is the beginning of wisdom, and it teaches us the Universal Principle of Wisdom upon which all other wisdom arises; that you are greater than this event, and this teaches us experientially the transcendent nature of the human spirit. This strengthens within us the concept of the enduring sense of self, to use Eriksonian terminology of the self, that we exist regardless of how someone, something, society through its expectations, whatever it might be, has attempted to devalue us. It teaches us true confidence, that no matter what happens, I am here and I know how to return home.\n\nThere arises from within a strong sense of gratitude after experiencing the state of Pure Consciousness. When we realize that it is coming from within then we realize that gratitude is inherent to us. This positive emotion and state is inherent to us. And by dwelling within that state with mindfulness, we can grow our power of gratitude and condition our bodies in that direction to let go of its stored trauma. We will be able to breath easier, stand up straighter, and walk with confidence and strength. And when it becomes internalized then it will form the basis of our sense of Self. Our paradigm will start to shift in a way that reflects this state of Gratitude.\n\nThe Mind will automatically start to create healthy and positive narratives to frame our traumatic experiences with. The Eye of Gratitude opens and we will start to perceive the wisdom hidden in our traumatic experiences. And as we start to feel gratitude for those particular lessons, our capacity for gratitude will also expand. We now have the tools to be able to confront our traumas, and so,"} +{"output_text": " in the $200M range for these types of films, and that\u2019s what Warner Bros. is banking on.\n\nThe Meg is a Warner Bros. release, and the studio is hoping to replicate the success of the first film, which was a surprise hit in China. The first film was a surprise hit in China, and the studio is hoping to replicate that success with the sequel. The first film was a surprise hit in China, and the studio is hoping to replicate that success with", "input_text": " and aaahs of the shared experience.\u201d\n\nGiven how high Warner\u2019s has opened Meg here off a B+ CinemaScore, they can certainly get the movie past $100M stateside. The grade carries a 3.2 multiple, which puts Meg at around $144M. But Warners has propelled these cinematic Asian exports like Pacific Rim ($37.2M opening, $101.8M domestic) and Rampage ($35.7M, with a domestic that we\u2019re told will ultimately cross $100M) to lofty heights before in the U.S./Canada. Meg\u2019s over-indexing bodes quite well for this year\u2019s August box office, which collapsed last summer, delivering a 20-year low of around $657.7M. ComScore reports that weekend ticket sales are at $146.7M, up a huge 25% over the same period a year ago. Annual B.O. to date is currently counting $7.79 billion, 8.4% ahead of the 2017 period of Jan. 1-Aug. 12. Those under 18 (15% on CinemaScore) loved The Meg the most with an A-. 52% males and 48% females gave it a B+, with those over 25 giving it a B.\n\nProfitability is a whole other topic when it comes to Meg, and we\u2019ll assess that in the days to come. Warners says this Sino-Foreign co-production cost $130M net, while we\u2019ve heard from people who have knowledge of the production cost that it\u2019s $178M. The latter number isn\u2019t out of the realm for what these types of Chinese co-productions cost; read The Great Wall ($150M), Warcraft ($160M), etc. We examined in our last post how global profit lies"} +{"output_text": "; in Berlin the Greens have been the only party to win a majority in the city\u2019s state parliament.\n\nSecond, social democrats need to be more like the Greens. The Greens are a party of the left, but they are also a party of the centre. They have a strong record of coalition-building, and they have a strong record of winning elections. They are not a party of the left, but they are a party of the centre.\n\nThird, social democr", "input_text": " own sons, was Germany\u2019s centre-right chancellor in the 1980s. On March 13th, as voters south of the city (in high-tech, environmentalist Baden-W\u00fcrttemberg) and north-east of it (formerly communist Saxony Anhalt) abandoned the party, the stolid voters of Ludwigshafen remained loyal.\n\nYet at a pre-election rally for Malu Dreyer, a brassy, witty local leader who stands out against her lacklustre peers, the mood was remarkably flat. Ms Dreyer hailed once-social-democratic goodies that all now favour: child care, low unemployment, vocational training (\u201cWe want Meisters [foremen] as well as Masters\u201d). A marching band played a foxtrot and \u201cMack the Knife\u201d for supporters whose average age must have been 60. On the walls were posters with unobjectionable slogans: \u201cResponsibility\u201d, \u201cStaying Together\u201d. \u201cJust Right For Our Time\u201d, read one\u2014but the time was the time of the grandparent.\n\nOn their current trajectory, social democrats may well end up like liberals and greens today: subordinate players confined to regional strongholds whose best chance of influence is to nudge other parties in their direction should they get into coalition. But there are still some who are both in power and relatively popular. Their successes offer three lessons.\n\nFirst, renewal ends with national government; it does not begin there. Mayoralties and regional governments hone precisely the mix of pragmatism and innovative policy thinking that social democrats need if they are to win nationally. In Manchester a dynamic leadership with a \u201cwhat works\u201d credo keeps Labour dominant in an increasingly globalised city; in Hamburg the SPD parties like it\u2019s 1969 thanks to a resilient coalition of low- and middle-earners"} +{"output_text": ", the Saudi mission was not a focus of the FBI's counterterrorism efforts, the former official said.\n\nThe Saudi mission was not a focus of the FBI's counterterrorism efforts, the former official said.\n\nThe FBI's counterterrorism efforts were focused on the Saudi government, which was then led by King Abdullah. The Saudi government was a major source of funding for al Qaeda and other terrorist groups, and the Saudi government was also a major source of funding for the 9/11 hij", "input_text": "orah case, they insisted privately that there is little more they can do. Nor is the Trump administration considering any complaint or sanction against the Saudi government for its role in abetting the flight of Noorah and other Saudi fugitives, officials said.\n\nThe FBI and the Justice Department declined to comment for this article, saying they could not discuss matters under investigation.\n\nIn 2008, intelligence analysts noticed a series of calls to an ICE office from a Saudi official in Washington\n\nIn 2008, intelligence analysts at the headquarters of Immigration and Customs Enforcement spotted a striking trend, national security officials recalled. The analysts, who monitored potential terrorist and criminal threats related to foreign students, noted a series of calls to an ICE office from a Saudi official in Washington.\n\nThe official worked at the Saudi Arabian Cultural Mission, a branch of the country's diplomatic operation that was then located in a building near the Watergate office complex. He called periodically to ask about the visa status of various Saudi students. On further examination, the analysts found that some of the students had been charged with crimes including rape, embezzlement, and theft, and that they had unlawfully fled the country, a former senior national security official said.\n\nThe J. Edgar Hoover building, the FBI's Washington, DC headquarters. FBI The mission, established in Washington in 1951, administers government scholarships overseas and helps prepare the visiting students for life in a culture very different from their own. If they ever run into trouble, they are instructed to call the mission. (If they did not, they could expect Saudi officials to call them.) Based on reporting by the FBI, US intelligence analysts believed the cultural center also served as a base for undercover Saudi intelligence officers who kept tabs on the growing numbers of students in the United States.\n\nWhile Saudi diplomatic activity was monitored by the FBI as part of its routine counterintelligence efforts"} +{"output_text": "t come up, it\u2019s not going to be a deal-breaker,\u201d said Rep. Jackie Speier Karen (Jackie) Lorraine Jacqueline SpeierOvernight Defense: House to vote on military justice bill spurred by Vanessa Guill\u00e9n death | Biden courts veterans after Trump's military controversies House to vote on 'I Am Vanessa Guill\u00e9n' bill Overnight Defense: Trump's battle with Pentagon poses risks in November | Lawmakers launch Fort Hood probe | Military", "input_text": " first half to the Warriors' none.... James' first quarter: 4 for 4 with a 3-pointer, 12 points, three assists, a steal and turnover.\n\nWarriors: Curry's early three-point play gave him the free throw he needed for 379 to pass Rick Barry (378) for first place on the Warriors' career list.... With his 293rd career 3 in the playoffs, made in the second quarter, Thompson passed Kobe Bryant (292) for sixth place on the NBA's list for postseason 3s.... Kevon Looney started again in place of Iguodala, but coach Steve Kerr went to McGee after the break.... Home run king Barry Bonds was in attendance.\n\n---\n\nMore AP NBA: https://apnews.com/tag/NBAbasketball House lawmakers from both parties want any candidate running for Speaker to promise to push for an overhaul of Capitol Hill\u2019s sexual harassment policies.\n\nWhile there is no organized effort to demand that any Speaker hopeful make a pledge to such reforms, a number of House members told The Hill that a candidate\u2019s stance on the issue will be one of the criteria they use to decide whether to back someone vying for the Speaker\u2019s gavel.\n\nSome items on their wish list, such as prohibiting the use of taxpayer dollars to settle sexual harassment claims, will likely find easy support with any Democrat or Republican seeking the top leadership job. But more contentious changes, including publicly revealing the names of lawmakers who have settled claims, may be a tougher ask, since the idea faces some resistance in both parties.\n\nADVERTISEMENT\n\nStill, the heightened attention on the issue \u2014 and the fact it could influence a leadership race \u2014 underscores the growing support for revamping Capitol Hill\u2019s sexual harassment policies following the national \u201cMe Too\u201d movement.\n\n\n\n\u201cIf the issue doesn\u2019"} +{"output_text": " to such content should be cautious.\n\nLimitations This study has several limitations. First, the data are limited to Twitter and may not be representative of other social media platforms. Second, the data are limited to accounts that have been active for at least 6 months. Third, the data are limited to accounts that have been active for at least 6 months. Fourth, the data are limited to accounts that have been active for at least 6 months. Fifth, the data are limited to accounts that have", "input_text": ". Presumably, accounts that are obviously automated are more frequently used to disseminate content such as news and may not be considered credible sources of grassroots antivaccine information. Public Health Implications Survey data show a general consensus regarding the efficacy of vaccines in the general population.35 Consistent with these results, accounts unlikely to be bots are significantly less likely to promote polarized and antivaccine content. Nevertheless, bots and trolls are actively involved in the online public health discourse, skewing discussions about vaccination. This is vital knowledge for risk communicators, especially considering that neither members of the public nor algorithmic approaches may be able to easily identify bots, trolls, or cyborgs. Malicious online behavior varies by account type. Russian trolls and sophisticated bots promote both pro- and antivaccination narratives. This behavior is consistent with a strategy of promoting political discord. Bots and trolls frequently retweet or modify content from human users. Thus, well-intentioned posts containing provaccine content may have the unintended effect of \u201cfeeding the trolls,\u201d giving the false impression of legitimacy to both sides, especially if this content directly engages with the antivaccination discourse. Presuming bot and troll accounts seek to generate roughly equal numbers of tweets for both sides, limiting access to provaccine content could potentially also reduce the incentive to post antivaccine content. By contrast, accounts that are known to distribute malware and commercial content are more likely to promote antivaccination messages, suggesting that antivaccine advocates may use preexisting infrastructures of bot networks to promote their agenda. These accounts may also use the compelling nature of antivaccine content as clickbait to drive up advertising revenue and expose users to malware. When faced with such content, public health communications officials may consider emphasizing that the credibility of the source is dubious and that users exposed"} +{"output_text": "omore\n\nForward\n\nBentil is a good athlete who can play multiple positions. He's a good shooter and a good rebounder. He's a good defender, too.\n\nThe Raptors have a lot of needs, but Bentil is a good fit for them. He's a good fit for the modern NBA, and he's a good fit for the Raptors.\n\n27. New York Knicks (via Cavaliers)\n\nMiles Bridges", "input_text": " he has a shot at going here.\n\nHe's not only a good athlete who can defend multiple positions, but he's also a terrific passer and an emerging shooter.\n\nCheick Diallo\n\nKansas\n\nFreshman\n\nForward\n\nDiallo didn't really do much at Kansas this past season, but that won't stop a team like the Raptors from grabbing him, especially after his strong play at the combine.\n\nHe is tough and athletic, and he might have the best motor in the draft. He's very, very raw. But had he stayed in school another year, he probably would have been a lottery pick in 2017. That makes him a good value here.\n\n28. Phoenix Suns (via Cavaliers)\n\nJuan Hernangomez\n\nSpain\n\nAge: 20\n\nForward\n\nThe Suns have three picks in this draft. They won't want to keep all three in the NBA next season.\n\nHernangomez is intriguing as a stretch-4 who played significant minutes in the ACB this year. The Suns can stash him for a year, and he ultimately could really help them down the road. Big guys with solid athleticism and shooting ability don't come along every day.\n\nChinanu Onuaku\n\nLouisville\n\nSophomore\n\nCenter\n\nThe Spurs got killed in the paint against the Thunder in the playoffs and will certainly be looking for some big men to help in the long term.\n\nOnuaku might be raw offensively, but he could bring a lot of the same toughness the Spurs saw in the Thunder's Steven Adams. Onuaku is a good rebounder and shot-blocker who plays with great toughness.\n\nBen Bentil\n\nProvidence\n\nSoph"} +{"output_text": ") button.\n\nTo load only the item of a save state, hold Reload when activating the white Load Save State (Load) button.\n\nTo load the item of a save state and the item of another save state, hold Reload when activating the white Load Save State (Load) button.\n\nTo load the item of a save state and the item of another save state, hold Reload when activating the white Load Save State (Load) button.\n\nTo load", "input_text": " Crouch Use the secondary function of a station Crouch while holding Reload Respawn (when dead) Hold Interact for 1 second Hold Melee for 1 second Go to next item in inventory Melee Equip selected inventory item Communicate: Acknowledge while at full health and out of combat\u00b9 Communicate: Yes while at full health and out of combat\u00b9 Sell selected inventory item for gold (default selling method selected) Interact Sell selected inventory item for gold (alternative selling method selected) Melee while holding Crouch Toggle between default/alternative selling method Interact while standing inside the white station in town Melee while standing inside the white station in town Teleport to spawn room Communicate: Need healing outside of town and out of combat\u00b9 Show cost of services in town Communicate: Thanks Communicate: Need healing while in town Use Soul Stone\u00b2 Communicate: Group Up outside of town Communicate: Press the Attack or Communicate: Push Forward Show stats of equipped item Communicate: Ultimate Status Communicate: Ready Show stats of selected inventory item Communicate: Ultimate Status while holding Crouch Communicate: Ready while holding Crouch Use the Teleport special ability\u00b3 Communicate: Hello Communicate: Goodbye or Reload + Interact Gift selected inventory item to player you are facing Melee while holding Reload \u00b9 Players are out of combat when they have not taken or dealt any damage for 3 seconds (their health will start to regenerate until they enter combat again). \u00b2 Only one player can have a Soul Stone active at the same time. Cannot be used when there is already a boss in the game. \u00b3 The player needs to have an item with that special ability equipped.\n\n2.2 Entering Save Codes\n\nTo load only the item of a save state, hold Reload when activating the white Load Save State (Load"} +{"output_text": "aky if he was messing with nature. And that's what I was going for. I was trying to make it as creepy as possible. I was trying to make it as freaky as possible. And I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I succeeded. I think I", "input_text": " this long I have a team that's got my back. A really, really good team. Obviously Vijaya has been my partner since before my first issue. Vijaya and I actually came up with the plan together for Cartoon Books, to do a black and white comic book every two months and each one will be a chapter in this larger novel. Vijaya knew the ending of the story before we even started. I've got Kathleen, who I mentioned earlier, Kathleen Glosan, who's helped Vijaya run the office. She's the contact with most of the outside world. She sets up publicity, takes care of things, makes sure what I'm supposed to be doing. I've got Steve Hamaker, who does not only the color onbut about any art-related job that needs to be done at Cartoon Books that isn't actual comic book pages. I still have to do those by myself. And we have Tom Gaadt, who does all the web stuff. He handles the store, and goes on the road. Everyone's been with us for years. We've had a good group for a long time.RASLI start out thinking I know what the ending is and where I'm going. And I do. Mostly. The ending will be the same ending. But as you write, especially a serialized book, which is what comic books do really well, I think, the story grows. You get ideas as you're going along. In, there's that spooky little girl that doesn't talk. She was not in my original plan. She just kind of popped onto the page one time. I was suggesting that Rasl, by going to different universes and traveling back and forth, was altering something. He was messing with nature. It popped into my head that it would be super-creepy and really fre"} +{"output_text": " array is not guaranteed to be zero-initialized } // precondition: array is not null public static List of(T[] array) { // precondition: array is not null return new List(array); } // postcondition: array is not null public static List of(T[] array) { // postcondition: array is not null return new List(array); } // precondition: array is not null public static List { TState iteratorStart { get } T? tryGetNext(ref TState state) } // Provides an iterator for all classes implementing IList extends IList with Iterator { uintz iteratorStart => 0 T? tryGetNext(ref uintz index) => if index < length some(this[index]) else none } // Allow to use a trait directly by inheritance class MyClassWithIterator : Iterator { }\n\nbuiltin syntax for tuple (valuetypes) with deconstructors let a = (1, \"value\") let x = a.0 // get 1 let str = a.1 // get \"value\" let (y, str2) = a // deconstruct tuple into variables\n\nenum sum-type /discriminated unions with the cool pattern matching // The type behind T? enum Option { none, some(T) }\n\nTry to fix The Billion Dollar Mistake by having a null safe language process(string name, string? valueMayBe) { // name is not null // valueMayBe may not have a value if let val?= valueMayBe { list.add(name, val) } else { //... } }\n\nmodules/namespaces, global functions, export type aliases // By default, namespace will be deduced from folder hierarchy // So most files in a project won't have to declare any namespaces namespace myNamespace::mySubNamespace { public let x = 5 public int32 increment(int32 val) => val++ }\n\nbuiltin contracts : for pre and post conditions via requires and ensures public virtual class List { unsafe T[] array; // unsafe because the"} +{"output_text": " became a business. In the early days of the Internet, it was a matter of concern for governments and corporations, who worried that the new medium would be used to undermine their authority. In the 1990s, the Clinton administration was concerned that the Internet would be used to undermine the government\u2019s authority. In the 2000s, the Bush administration was concerned that the Internet would be used to undermine the government\u2019s authority. In the 2010s, the Obama administration was concerned that the Internet would be", "input_text": " proportions. The social-media influencer has an eerie double in the hacker who covertly shapes political discourse. Both flourish in our increasingly networked world, in which digital influence is sharply double-edged\u2014a salable commodity and a threat to democracy, a commercial dream and a political nightmare.\n\nConnectivity is the basis for the heightened role that influence now plays in our lives. Digital technologies soften the borders between people and create a porousness upon which influence depends. In a fairly undisguised etymology, the word \u201cinfluence\u201d comes from the Latin for \u201cinflow,\u201d which provides an image of the way that, every second, our thoughts now stream into one another\u2019s pockets. The same image evokes our anxieties about hostile foreign states penetrating our defenses. Influence is a challenge to sovereignty, both political and personal; to admit to being influenced is to give up the attractive idea that, as individuals or societies, we are entirely self-contained.\n\nThe elusive quality of influence\u2014the difficulty we encounter when we try to identify its sources or measure its effects\u2014is equally destabilizing. Influence works best when it\u2019s wielded obscurely, in the shadows and behind the scenes, and this has clear social consequences for a society engaged in building a digital-influence economy. Based on the available evidence, it seems that we can\u2019t construct an influence economy without stoking a culture of skepticism and paranoia.The fear of being influenced affects our sense of reality and our ability to trust our own judgments about what is true. Election hackers and commercial influencers have wildly different aims, but both contribute to the unreal, distrustful tenor of our times, in which a language of fakery, deception, and inauthenticity has become fundamental to how we interpret the world.\n\nInfluence was worrisome long before it"} +{"output_text": " the balcony.\n\nShe was later seen in Rosewood, where she was seen with Toby and Jenna. She was seen with Toby and Jenna in the woods, where she was seen with a gun. She was seen with Toby and Jenna in the woods, where she was seen with a gun. She was seen with Toby and Jenna in the woods, where she was seen with a gun. She was seen with Toby and Jenna in the woods,", "input_text": " asked about her southern accent, she said she grew up in South Carolina and recently moved from New York to Rosewood to live with her aunt (\"Pretty Dirty Secrets\"). It was later revealed that she was actually from Georgia and lived next door to Alison's grandmother. She described her parents as being strict, and her aunt as being more \"lax\" about things. She briefly dated Paige during the summer and was shown to be very close with Jenna.\n\nIt was also revealed by Mona that Jenna and Shana knew each other before she came to town, and Shana might be in love with Jenna. Shana was a member of The Alliance, unofficially known as the B-Team. Her intentions were unknown at the time. Shana also told Spencer that Jenna is scared of CeCe Drake.\n\nShe was known to have many ties with Ravenswood too. She was seen by Spencer in that town when she and Toby were investigating in \"Under The Gun.\" She hastily got into Jenna's car and they sped off, meaning Mona or Jenna could have been driving. It was later revealed that Shana was helping Alison by staying close to Jenna to make sure she wasn't the one after Ali.\n\nShe fell in love with Jenna and decided to get justice for her by trying to kill Ali. She tracked down Ali in New York City and tried to shoot Ali, but Ezra got in the way and he was shot. She went to the hospital where Ezra was at but left as he woke up. She found Ali and The Liars (minus Aria) at the Fitzgerald theater and attempted to kill her again, and revealed that she set fire to lodge in order to kill the girls to get revenge. Before she could shoot, Aria arrived and hit her with a gun, sending her off"} +{"output_text": ", which includes the state\u2019s public schools, charter schools, and the state\u2019s voucher program. [Emphasis added.]\n\nIn other words, the Louisiana State Board of Education was effectively bought by the state\u2019s billionaire governor, and the state\u2019s billionaire governor was installed as the state\u2019s education superintendent.\n\nIn the 2012 election, Jindal was re-elected; BESE was effectively purchased, and in January 2012, John White was re-installed as Louisiana", "input_text": " State Board of Ed.\n\nPatricia\n\nAnd so, it came to pass:\n\nLast fall (in 2011), a coterie of extremely wealthy billionaires, among them New York City Mayor Michael Bloomberg, turned the races for unpaid positions on the Louisiana Board of Elementary and Secondary Education (BESE) into some of the most expensive in the state\u2019s history. Seven pro-education \u201creform\u201d candidates for the BESE outraised eight candidates endorsed by the teacher\u2019s unions by $2,386,768 to $199,878, a ratio of nearly twelve to one. In just one of these races, the executive director of Teach for America Greater New Orleans-Louisiana Delta, Kira Orange Jones, outspent attorney Louella Givens, who was endorsed by the state\u2019s main teacher\u2019s unions, by more than thirty-four to one: $472,382 to $13,815. [Emphasis added.]\n\nBy November 2011, Jindal was re-elected; BESE was effectively purchased, and in January 2011, John White was installed as Louisiana State Superintendent of Education:\n\nWhy would out-of-state billionaires care about Louisiana\u2019s Board of Elementary and Secondary Education? The state board must approve the governor\u2019s nominee for the powerful state superintendent of education by a two-thirds majority, and the 2007\u201311 board would have been unlikely to approve Jindal\u2019s nominee, John White. White had been in Louisiana for less than a year at the time, after coming from New York City to head Louisiana\u2019s Recovery School District, which the BESE directly supervises. A Teach for America alum, White had previously spent five years working as a deputy chancellor for the New York City Department of Education under Michael Bloomberg. Louisiana\u2019s education superintendent administers the state\u2019s educational system"} +{"output_text": " version of the Great Depression is coming.\n\nAnd that's just the housing market.\n\nThe real estate market is just the tip of the iceberg.\n\nThe credit crisis is just the tip of the iceberg.\n\nThe global credit crisis is just the tip of the iceberg.\n\nThe global credit crisis is just the tip of the iceberg.\n\nThe global credit crisis is just the tip of the iceberg.\n\nThe global credit crisis is just", "input_text": " for a market bottom (below Dow 1500).\n\nAnother unforeseen, massive decline in the U.S. stock market (and worldwide) will devastate market psychology, and that will quickly affect pretty much everything economic -- from consumer confidence to consumer spending to corporate earnings to employment, clear through to the housing market. Lenders will pull in their horns even further and few will be in the mood to borrow. Deflationary forces will be fully in play and those forces will continue to feed on each other as the value of virtually every asset class heads down yet again. Those not ready for this reversal will wonder how in heaven's name we went from placid waters to perfect storm in such a short period of time.\n\nMeanwhile, there's substantially more pain to come in real estate. As the Realtors proclaim on their TV commercials, \"Every market's different,\" which is their slimy way of saying, \"Sure, prices are falling everywhere else; just not in your neighborhood!\"\n\nOh, those wacky Realtors.\n\nOn the housing front -- and despite the fact we've seen positive price activity in scattered markets in the U.S. in recent months -- values are about to begin another descent as the global credit crisis/implosion enters its hair-raising stage, the stock market gets bludgeoned and another fifteen million foreclosures (including the existing, massive shadow inventory and what will become a growing wave of \"strategic defaults\") are brought to market over the next few years. When home values are cut in half after a 75 year run, is it possible they can be cut in half again? With ever-greater numbers of foreclosures, an even more intense economic recession and what will no doubt be eroding demand feeding the deflationary monster, bank on that rate of decline and more.\n\nReal estate's own"} +{"output_text": " cinema. Il sesto elemento fu il primo film di Besson che usc\u00ec in America. Il film fu proiettato in cinque diverse lingue, in particolare in inglese, spagnolo, francese e italiano.\n\nIl sesto elemento fu il primo film di Besson che usc\u00ec in America. Il film fu proiettato in cinque diverse lingue, in particolare in ingles", "input_text": "o, Moebius e Alejandro Jodorowsky avrebbero fatto causa a Besson per aver saccheggiato materiale dal loro fumetto Incal. La causa fu respinta ma fa capire quanto, in effetti, il film viaggi al confine tra omaggio spinto e plagio.\n\nEffetti speciali. Il quinto elemento fu anche un film importante nella storia degli effetti speciali. Il team di Besson combin\u00f2 miniature e CGI (fornita dalla Digital Domain) per creare un mondo colorato e spettacolare, complesso e caotico. Non si era mai visto nulla come la New York del 2263 in cui si svolge buona parte del film. Nel 1997 quegli effetti erano rivoluzionari. Oggi ovviamente si pu\u00f2 fare ben di pi\u00f9, ma il film va rivisto nell'ottica di vent'anni fa, per capire l'evoluzione degli effetti speciali al cinema.\n\n\n\nIl quinto elemento fu anche un film importante nella storia degli effetti speciali. Il team di Besson combin\u00f2 miniature e CGI (fornita dalla Digital Domain) per creare un mondo colorato e spettacolare, complesso e caotico. Non si era mai visto nulla come la New York del 2263 in cui si svolge buona parte del film. Nel 1997 quegli effetti erano rivoluzionari. Oggi ovviamente si pu\u00f2 fare ben di pi\u00f9, ma il film va rivisto nell'ottica di vent'anni fa, per capire l'evoluzione degli effetti speciali al"} +{"output_text": " important detail, you always want to be clear.\n\n8. \u2013 The camera is always in the right place. It\u2019s always in the right place. It\u2019s always in the right place. It\u2019s always in the right place. It\u2019s always in the right place. It\u2019s always in the right place. It\u2019s always in the right place. It\u2019s always in the right place. It\u2019s always in the right place. It\u2019s always", "input_text": " it aligns itself to a two-point perspective, as well as for the motion to come). But the movement is simply a physical one.\n\n5. \u2013 Minimal movement but there is a slight dolly in on when Antoine grabs the money out off the table. These are the little things you do when you want to underline something. It\u2019s a very slight movement, but it\u2019s there nonetheless. It goes in a bit tighter because we need to see what he\u2019s grabbing. If that shot stayed in its original, slightly wider position \u2013 the impact wouldn\u2019t be the same. Afterall, those money are a detail and you always want to underline a detail. Hence, without cutting \u2013 you\u2019re linking once again, shot lengths.\n\n6. \u2013 Is a nice blend of geography, and subjective shot. When Antoine walks into the room and it cuts to this, you have the camera slightly dollying and panning as if to show you the eyes and the physical movement of Antoine, before the shot goes from subjective (Antoine\u2019s eyes, and ours as an audience) to an objective one with him walking into a frame. Two birds with one stone. Go you, Truffaut and Decae! Filmmaking at its finest.\n\n7. The camera acts to show geography. When Antoine goes to pickup his folder \u2013 the camera moves based on the idea that it wants to actually show us what he picked up. You always want to be clear with your shots. Just imagine how lazy it would have been if \u2013 after the pan from the table on to the hallway, you didn\u2019t slightly dolly to the left like in the film. You\u2019d immediately have the folder obscured. You want to always be clear with the audience, no matter how small it is. Unless you\u2019re leaving out some"} +{"output_text": ".org.my ; echo debian.org ; echo freenetproject.org ; } | parallel -k traceroute\n\nwill display the traceroute invocation that finishes last.\n\nParallel can also be used to run a command in parallel on multiple files. For instance, the following command\n\nfind . -type f -name '*.txt' -print0 | parallel -0 -j+0 \"cat {}\"\n\nwill run cat on all files in the current", "input_text": " parallel xargs by adding the -X option:\n\nfind. -depth 1 \\! -name '.*' -print0 | parallel -0 -X mv {} destdir\n\nGNU Parallel can replace certain common idioms that operate on lines read from a file (in this case, filenames listed one per line):\n\nwhile IFS= read -r x; do do-something1 \"$x\" \"config-$x\" do-something2 < \"$x\" done < file | process-output\n\nwith a more compact syntax reminiscent of lambdas:\n\ncat list | parallel \"do-something1 {} config-{} ; do-something2 < {}\" | process-output\n\nParallel provides a built-in mechanism to remove filename extensions, which lends itself to batch file transformations or renaming:\n\nls *.gz | parallel -j+0 \"zcat {} | bzip2 >{.}.bz2 && rm {}\"\n\nThis will recompress all files in the current directory with names ending in.gz using bzip2, running one job per CPU (-j+0) in parallel. (We use ls for brevity here; using find as above is more robust in the face of filenames containing unexpected characters.) Parallel can take arguments from the command line; the above can also be written as\n\nparallel \"zcat {} | bzip2 >{.}.bz2 && rm {}\" ::: *.gz\n\nIf a command generates output, you may want to preserve the input order in the output. For instance, the following command\n\n{ echo foss.org.my ; echo debian.org ; echo freenetproject.org ; } | parallel traceroute\n\nwill display as output the traceroute invocation that finishes first. Adding the -k option\n\n{ echo foss"} +{"output_text": " \u03c0\u03bf\u03c5 \u03ba\u03c5\u03ba\u03bb\u03bf\u03c6\u03bf\u03c1\u03bf\u03cd\u03bd \u03c3\u03c4\u03bf\u03bd \u03ba\u03cc\u03c3\u03bc\u03bf.\n\n\u0397 \u03b1\u03c3\u03c4\u03c5\u03bd\u03bf\u03bc\u03af\u03b1 \u03bc\u03b1\u03c2 \u03ad\u03b4\u03c9\u03c3\u03b5 \u03c4\u03b7\u03bd \u03b5\u03c5\u03ba\u03b1\u03b9\u03c1\u03af\u03b1 \u03bd\u03b1 \u03b1\u03bd\u03b1\u03ba\u03b1\u03bb\u03cd\u03c8\u03bf\u03c5\u03bc\u03b5 \u03c4\u03bf\u03bd \u03c0\u03cc\u03bb\u03b5\u03bc\u03bf \u03c0\u03bf\u03c5 \u03b4\u03b9\u03ad\u03c0\u03c1\u03b1\u03be\u03b5 \u03b7 \u03ba\u03bf\u03b9\u03bd\u03c9\u03bd\u03af\u03b1 \u03bc\u03b1\u03c2. \u0397 \u03b1\u03c3\u03c4\u03c5\u03bd\u03bf\u03bc\u03af\u03b1 \u03bc\u03b1\u03c2 \u03ad\u03b4\u03c9\u03c3\u03b5 \u03c4\u03b7\u03bd \u03b5\u03c5\u03ba\u03b1\u03b9\u03c1\u03af\u03b1 \u03bd\u03b1 \u03b1\u03bd\u03b1\u03ba\u03b1\u03bb\u03cd\u03c8", "input_text": "\u03bd\u03ae \u03b1\u03b4\u03b9\u03ba\u03af\u03b1 \u03c4\u03c9\u03bd \u03ba\u03bf\u03b9\u03bd\u03c9\u03bd\u03b9\u03ba\u03ce\u03bd \u03b1\u03bd\u03b9\u03c3\u03bf\u03c4\u03ae\u03c4\u03c9\u03bd. \u0395\u03be\u03ac\u03bb\u03bb\u03bf\u03c5 \u03b3\u03b9\u03b1 \u03b5\u03bc\u03ac\u03c2 \u03c0\u03bf\u03c5 \u03cc\u03bb\u03b7 \u03c4\u03b7\u03bd \u03b7\u03bc\u03ad\u03c1\u03b1 \u03b1\u03bb\u03b7\u03c4\u03b5\u03cd\u03b1\u03bc\u03b5 \u03c3\u03b5 \u03c0\u03ac\u03c1\u03ba\u03b1 \u03ba\u03b1\u03b9 \u03c0\u03bb\u03b1\u03c4\u03b5\u03af\u03b5\u03c2 \u03b4\u03b5\u03bd \u03ae\u03c4\u03b1\u03bd \u03ba\u03b1\u03b9 \u03b4\u03cd\u03c3\u03ba\u03bf\u03bb\u03bf \u03bd\u03b1 \u03b1\u03bd\u03c4\u03b9\u03c0\u03b1\u03b8\u03ae\u03c3\u03bf\u03c5\u03bc\u03b5 \u03c4\u03b7\u03bd \u03b1\u03c3\u03c4\u03c5\u03bd\u03bf\u03bc\u03af\u03b1 \u03b1\u03ba\u03cc\u03bc\u03b1 \u03ba\u03b1\u03b9 \u03b1\u03c0\u03cc \u03ad\u03bd\u03c3\u03c4\u03b9\u03ba\u03c4\u03bf \u03bc\u03c0\u03bf\u03c1\u03bf\u03cd\u03bc\u03b5 \u03bd\u03b1 \u03c0\u03bf\u03cd\u03bc\u03b5. \u0395\u03af\u03c7\u03b1\u03bc\u03b5 \u03b4\u03b5\u03b9 \u03c4\u03bf\u03c5\u03c2 \u03bc\u03c0\u03ac\u03c4\u03c3\u03bf\u03c5\u03c2 \u03bd\u03b1 \u03be\u03b5\u03c6\u03c4\u03b9\u03bb\u03af\u03b6\u03bf\u03c5\u03bd \u03bc\u03b5\u03c4\u03b1\u03bd\u03ac\u03c3\u03c4\u03b5\u03c2 \u03c3\u03c4\u03bf \u03ba\u03ad\u03bd\u03c4\u03c1\u03bf \u03c4\u03b7\u03c2 \u0391\u03b8\u03ae\u03bd\u03b1\u03c2, \u03b5\u03af\u03c7\u03b1\u03bc\u03b5 \u03b4\u03b5\u03b9 \u03c4\u03bf\u03bd \u03c4\u03c1\u03cc\u03c0\u03bf \u03bc\u03b5 \u03c4\u03bf\u03bd \u03bf\u03c0\u03bf\u03af\u03bf \u03c3\u03c5\u03bc\u03c0\u03b5\u03c1\u03b9\u03c6\u03b5\u03c1\u03cc\u03bd\u03c4\u03bf\u03c5\u03c3\u03b1\u03bd \u03c3\u03c4\u03bf\u03c5\u03c2 \u03c4\u03bf\u03be\u03b9\u03ba\u03cc-\u03b5\u03be\u03b1\u03c1\u03c4\u03b7\u03bc\u03ad\u03bd\u03bf\u03c5\u03c2 \u03ba\u03b1\u03b9 \u03c3\u03c4\u03bf\u03c5\u03c2 \u03ac\u03c3\u03c4\u03b5\u03b3\u03bf\u03c5\u03c2 \u03c0\u03c1\u03bf\u03c3\u03b2\u03ac\u03bb\u03bb\u03bf\u03bd\u03c4\u03b1\u03c2 \u03c4\u03bf\u03c5\u03c2. \u0391\u03c5\u03c4\u03ac \u03b2\u03ad\u03b2\u03b1\u03b9\u03b1 \u03b5\u03af\u03bd\u03b1\u03b9 \u03c0\u03c1\u03ac\u03b3\u03bc\u03b1\u03c4\u03b1 \u03c0\u03bf\u03c5 \u03bc\u03b5 \u03bc\u03b5\u03c1\u03b9\u03ba\u03ad\u03c2 \u03b2\u03cc\u03bb\u03c4\u03b5\u03c2 \u03c3\u03c4\u03bf \u03ba\u03ad\u03bd\u03c4\u03c1\u03bf \u03c4\u03b7\u03c2 \u0391\u03b8\u03ae\u03bd\u03b1\u03c2 \u03bc\u03c0\u03bf\u03c1\u03b5\u03af \u03bd\u03b1 \u03b4\u03b5\u03b9 \u03bf \u03ba\u03b1\u03b8\u03ad\u03bd\u03b1\u03c2. \u0397 \u03b1\u03bd\u03c4\u03af\u03c6\u03b1\u03c3\u03b7 \u03c0\u03bf\u03c5 \u03b2\u03b9\u03ce\u03bd\u03b1\u03bc\u03b5 \u03cc\u03bc\u03c9\u03c2 \u03ae\u03c4\u03b1\u03bd \u03cc\u03c4\u03b1\u03bd \u03c4\u03bf\u03c5\u03c2 \u03b2\u03bb\u03ad\u03c0\u03b1\u03bc\u03b5 \u03bd\u03b1 \u03c5\u03c0\u03bf\u03ba\u03bb\u03af\u03bd\u03bf\u03bd\u03c4\u03b1\u03b9 \u03ba\u03b1\u03b9 \u03bd\u03b1 \u03b3\u03bb\u03cd\u03c6\u03bf\u03c5\u03bd \u03c4\u03bf\u03c5\u03c2 \u03c0\u03bb\u03bf\u03cd\u03c3\u03b9\u03bf\u03c5\u03c2 \u03b5\u03ba\u03b5\u03af \u03c0\u03bf\u03c5 \u03bc\u03ad\u03bd\u03b1\u03bc\u03b5. \u0395\u03ba\u03b5\u03af \u03ba\u03b1\u03c4\u03b1\u03bb\u03ac\u03b2\u03b1\u03bc\u03b5 \u03c0\u03c1\u03b1\u03b3\u03bc\u03b1\u03c4\u03b9\u03ba\u03ac \u03c4\u03b9 \u03b4\u03b9\u03c0\u03c1\u03cc\u03c3\u03c9\u03c0\u03b1 \u03c3\u03ba\u03bf\u03c5\u03bb\u03ae\u03ba\u03b9\u03b1 \u03ba\u03b1\u03b9 \u03b8\u03c1\u03b1\u03c3\u03cd\u03b4\u03b5\u03b9\u03bb\u03bf\u03b9 \u03b5\u03af\u03bd\u03b1\u03b9 \u03b1\u03c5\u03c4\u03ae \u03b7 \u03c3\u03b9\u03c7\u03b1\u03bc\u03ad\u03bd\u03b7 \u03c6\u03ac\u03c1\u03b1"} +{"output_text": " run fake and throws a perfect pass to Matthews for the touchdown.\n\nOn this play, Tannehill throws a perfect pass to Matthews for the touchdown.\n\nWeek 5 vs Miami Dolphins 17-24 Loss\n\nRyan Tannehill\n\n24/43 266 yards\n\n2 Touchdowns\n\n1 Interception\n\nHere, Tannehill throws a perfect pass to Matthews for the touchdown.\n\nOn this play, Tannehill throws a perfect pass to Matthews", "input_text": "ceptions\n\nOn this play, Tannehill completes an easy out route to Rishard Matthews for the touchdown.\n\n\n\n\n\nWeek 2 @ Jacksonville Jaguars 23-20 Loss\n\nRyan Tannehill\n\n30/44 359 yards\n\n2 Touchdowns\n\n0 Interceptions\n\nHere, Tannehill feels the pressure and improvises. As he rolls to his left he sees Damien Williams crossing the field. He completes the pass near the sideline for the touchdown.\n\nStoneburner runs a nice slant route as Tannehill throws a laser passed the middle linebacker for the touchdown.\n\nWeek 3 vs Buffalo Bills 14-41 Loss\n\nRyan Tannehill\n\n26/49 297 yards\n\n2 Touchdowns\n\n3 Interceptions\n\nEven though the ball was thrown behind Landry, it was not Tannehill's fault the ball was intercepted. Preston Brown does a nice job of coming down with the loose football.\n\nThe blitzing safety comes in completely untouched, forcing Tannehill to get rid of the football. Unfortunately, he throws it right into the arms of Bills' linebacker, Preston Brown. Ugly play all around.\n\nTannehill feels the pressure coming from the left side and throws up a prayer, interception.\n\nOn this play, Tannehill executes a nice play action, throwing a near perfect pass to Rishard Matthews for the touchdown.\n\nOn 4th and 8, Tannehill has no decision but to give his receiver a chance. Matthews makes a nice adjustment to the under-thrown pass, resulting in a touchdown.\n\n\n\n\n\nWeek 4 vs New York Jets 14-27 Loss\n\nRyan Tannehill\n\n19/44 198 yards\n\n2 Touchdowns\n\n2 Interceptions\n\nHere, Tannehill sells the"} +{"output_text": " external stimuli, introverts draw their energy from internal stimuli.\n\nThis means that while extroverts may be more likely to be distracted by external stimuli, introverts are more likely to be distracted by internal stimuli.\n\nThis means that while extroverts may be more likely to be distracted by external stimuli, introverts are more likely to be distracted by internal stimuli.\n\nThis means that while extroverts may be more likely to be distracted by external stimuli, introverts are more", "input_text": " that surround them. Without understanding the introverted mindset and the way in which it manifests itself, there is a danger that some individuals may be considered as being shy, unemotional and even lacking in empathy.\n\nAccording to the Myers-Briggs organization, an estimated 50% of the population can be identified as introverted. Numerous scientific studies have explored the depths of the introverted mind, addressing some common misunderstandings, and offering insight into how introverts think.\n\nBy studying the introvert approach, it has shed some light on the key productivity secrets of introverts, and the unique attributes that can make them invaluable to an employer.\n\nHere are a few debunked perceptions about introverts, and reasons why these personality types make your office better:\n\nOutstanding Creative Thinkers\n\nThey may not be the first to throw a suggestion into the brainstorming ring, but introverts are excellent at synthesizing information and forming creative solutions.\n\nBack in 2012, Randy Barker of Harvard University discovered that introverts tended to have larger and thicker masses of grey matter in the prefrontal cortex. This is the area of the brain that drives abstract thinking and decision making, which is in turn crucial to the process of generating creative ideas and making informed judgements.\n\nSo while introverts may be more inclined to develop their ideas and streamline them internally before interacting with others, they often deliver a quality and clarity of thought that can underpin any creative campaign.\n\nIf you want to conduct efficient and productive brainstorming sessions, introverts should be empowered to participate regularly in these formats.\n\nNot Easily Distracted\n\nEveryone finds a loud office environment distracting, but some people may be wired to better tune out the noise.\n\nThe aforementioned Myers-Briggs organization has long since asserted that while extroverts draw their energy from"} +{"output_text": " compromised. He said that he had been using the Rest-client gem for a while and had been using it in a production environment. He said that he had been using the gem in a production environment for a while and had been using it in a production environment for a while. He said that he had been using the gem in a production environment for a while and had been using it in a production environment for a while. He said that he had been using the gem in a production environment for a", "input_text": " able to go for higher studies or enjoy a vacation because of the drinking habits of the bread winner\u201d, says Sebastian.But neither the Church homilies nor the president\u2019s horror seems to have any impact on the Malayali. With no major hurry in life, no great manufacturing rat race, plentiful annual inward remittances of about Rs 25,000 crore from around the world, and a good scenery outside the window, the average Malayali sees it as an ideal setting to raise another toast. The Gandhi statue can only stand stupefied.Year Petroleum Liquor2005-06 2,028 1,4222006-07 2,338 1,6942007-08 2,341 1,9762008-09 2,670 2,5092009-10 2,903 3,000 The Case for 2FA, Post Rest-client Gem CVE\n\nMost CVEs occur as a result of a oversight in the architecture or mishandling of how libraries may interact with your application. In some cases like what had occurred with the Rest-client gem version 1.6.13, a package maintainer account on https://rubygems.org was hijacked and used to push malicious code that would compromise sensitive credentials for payment manager accounts, database access, repository access, and others that can cause irreparable damages. The hijacker conducted a series of releases - 1.6.10, 1.6.11, 1.6.12, and 1.6.13 - all of which contained malicious code. This attack was also more elusive in that it was affecting a point release from a older version. This strategy could have been for a target using a version within 1.6.10-.\n\nWhat Possibly led to the CVE\n\nWe had a chance to speak to Matt Manning who provided some clues to what may have led to his account being"} +{"output_text": " slow descent. The crew cabin was separated from the Orbiter's aft fuselage by the remaining umbilical lines. The crew cabin was then thrown out of the Orbiter's aft end and fell back to Earth. The crew cabin was destroyed on impact with the ground.\n\nThe crew cabin was destroyed on impact with the ground. The crew cabin was destroyed on impact with the ground. The crew cabin was destroyed on impact with the ground. The crew cabin was destroyed on impact", "input_text": " into the flight, the Shuttle was travelling Mach 1.92 at an altitude of 46,000 ft (14,035 m), equating to a speed of over 1,250 mph (2,040 km/h). The continuing rotation of the right SRB pushed the Shuttle off course such that its nose was no longer pointed in the same direction as it was flying.\n\nThis motion at such high speed put intense aerodynamic loads on the Orbiter's airframe. The stresses these loads created were simply too great for the Shuttle to bear, and it quickly broke up into several large pieces. First to fail was the forward fuselage which broke away trailing a mass of umbilical lines pulled loose from the payload bay. The nose of the Orbiter also separated from the crew cabin and spilled the hypergolic nitrogen tetroxide fuel used in the reaction control system (RCS). This fuel ignited in a reddish-brown cloud that can be seen emerging from the cloud surrounding the disintegrating Shuttle.\n\n\n\nStructural breakup of the Orbiter\n\nThe remainder of the Orbiter, its forward end suddenly opened to the supersonic flow, blew apart from the inside out and threw up a hail of debris that emerged from the massive vapor cloud. Clearly visible in the debris raining down after the breakup were the crew cabin, left wing, and the aft fuselage containing the three main engines that were still firing using their remaining propellant. The two SRBs crossed paths and continued operating until 110 seconds after launch, when they were destroyed using onboard self-destruct explosives.\n\n\n\nOrbiter debris\n\nFate of the Crew:\n\nThe momentum of the crew cabin, containing the seven astronauts, carried it upward to an altitude of about 64,000 ft (19,525 m) before it began a"} +{"output_text": " FRBNY to inspect the gold. The Bundesbank has been in the process of moving its gold out of the FRBNY's vaults and into the Bundesbank's own vaults. The Bundesbank has been doing this for years, but the move is now complete.\n\nThe Bundesbank has been doing this for years, but the move is now complete.\n\nThe Bundesbank has been doing this for years, but the move is now complete.\n\nThe Bundesbank has", "input_text": " about central bank and government gold reserves. (Read more: GOP Appeased Me on Gold Standard: Rep. Ron Paul)\n\nThe gold of the United States government, which officials say is held in the United States Bullion Depository in Fort Knox, has been rumored\u2014probably since the Depository's founding in 1937\u2014to have been looted and replaced with gold-painted tungsten, for instance. The government treats this as nonsense\u2014which it most likely is\u2014but nonetheless it does conduct regularly scheduled audits of the gold in Fort Knox, including doing purity tests on a small sample of the gold.\n\nThis paranoia is not entirely irrational, for one reason. As I mentioned above, for almost all imaginable operational purposes, the actual existence of the gold in Fort Knox or in the vault beneath the FRBNY's Liberty Street headquarters is irrelevant. The bookkeeping is what really matters here. So long as the Fed says Bundesbank owns X tons of gold, the Bundesbank can act as if it did own the gold\u2014even if the gold had somehow been swallowed into a gold-eating galactic worm hole. But the irrelevance of the facticity of the gold does quite easily lend itself to thinking: if the gold being there doesn't matter, why would it be there?\n\n(Read more: The Gold Standard and the Myth of Price Stability)\n\nI'm sure the Bundesbank officials understand this quite well, even though the German Audit Court does not. There is nothing to be gained by inspecting the gold. If it is all there and pure, there is no difference from an undiscovered absence. But if the gold isn't there, well, calamity could follow as trust in the central bank gold depositories evaporated instantly.\n\nIn any event, it looks like Bundesbank officials will soon be visiting the"} +{"output_text": " to vote by affidavit, and because the state\u2019s election officials were so incompetent that they couldn\u2019t even get the machines to work.\n\nThe New York Times is \u201cNot Even Trying\u201d\n\nThe New York Times is not even trying to hide its bias. It\u2019s not even trying to hide its contempt for the voters of New York. It\u2019s not even trying to hide its contempt for the voters of New York. It\u2019s not even trying to hide its contempt for", "input_text": " is absurdly unrealistic, so the hearing had to be postponed. This meant that literally millions of voters were unsure whether the election would be open and their votes would count. The judge said it was okay because voters already had a remedy; they simply had to go to a judge and get a court order allowing them to vote.\n\nNo Working Machines\n\nEven when polling places opened on time, there were reports of locations without a single working machine. So what\u2019s the solution? Oh yeah, an affidavit. Of course. Or wait until a new machine is brought in, not knowing how long that wait will be. Again, people had to leave because of work.\n\nVoters Listed in Wrong Location\n\nAs I write this list, my anger and depression are hitting me all over again. There are so many stories of voters going to polling locations only to not be listed. Here, take an affidavit, they were told, but many would-be voters were more stalwart and refused. Sometimes they turned out to magically be listed after all, and other times they were listed in different districts, despite having no change of address or anything\u2014they were simply mis-listed. Or sometimes, literally everyone at the location wasn\u2019t listed and was forced to fill out our good friend the affidavit.\n\nNew York has \u201c2nd Lowest Voter Turnout So Far\u201d\n\nIt\u2019s absurd that New York is only ahead of Louisiana in terms of voter turnout. Only 19.7% of eligible New York Democratic voters cast a ballot. You know why the title of this section is in quotes? It\u2019s because New York\u2019s turnout was actually higher than usual and above what the official numbers indicate (here\u2019s an article literally titled \u201cVoter turnout unusually high in New York\u201d). It\u2019s because tens or hundreds of thousands of people were forced"} +{"output_text": " Iran. The U.S. had won the last two matches, including a thrilling victory over the host nation in the semifinals. Iran had won the last two matches, including a thrilling victory over the host nation in the semifinals.The U.S. had a slight edge in the first period, but Iran scored the first points of the match. The U.S. wrestlers were able to score a takedown in the second period, but Iran scored a takedown of", "input_text": "1.1 CPU: 2.3 GHz quad-core Snapdragon 801\n\n2.3 GHz quad-core Snapdragon 801 Screen: 5-inch 1920 x 1080 LCD display (441 PPI)\n\n5-inch 1920 x 1080 LCD display (441 PPI) RAM: 2GB\n\n2GB Storage: 32GB\n\n32GB Camera: 4 \"UltraPixel\" rear / 5MP front\n\n4 \"UltraPixel\" rear / 5MP front Battery: 2600 mAh Li-Po\n\n2600 mAh Li-Po Dimensions: 5.76 x 2.78 x 0.37 in\n\n5.76 x 2.78 x 0.37 in Weight: 5.64 ounces\n\n5.64 ounces Price: $100 (32GB) on contract with Verizon Frank Molinaro celebrates after beating two-time world medalist Sayed Mohammadi of Iran (Photo/John Sachs, Tech-Fall.com) Frank Molinaro celebrates after beating two-time world medalist Sayed Mohammadi of Iran (Photo/John Sachs, Tech-Fall.com)\n\nJames Green was undefeated at the Freestyle World Cup (Photo/John Sachs, Tech-Fall.com) James Green was undefeated at the Freestyle World Cup (Photo/John Sachs, Tech-Fall.com)\n\n3rd Place Match\n\nPool Final\n\nLOS ANGELES -- Despite some amazing performances, the U.S. freestyle wrestling team lost two matches in the final day of the World Cup to finish in fourth place.Against Iran and Georgia, both matches ended with each team winning four bouts, but with USA earning fewer classification points and losing on criteria.There was a palpable anticipation at the Forum in Inglewood heading into the matchup between USA and"} +{"output_text": " in the 1970s.\n\n\u201cI\u2019m not gonna lie, I was a little bit upset,\u201d Cousins said. \u201cI was a little bit upset with myself. I was a little bit upset with the way I played. I was a little bit upset with the way I reacted to certain things. I was a little bit upset with the way I played. I was a little bit upset with the way I reacted to certain things. I was a little bit upset with the way I", "input_text": "ity, it\u2019s not. It sucks in the moment, but you overcome in that moment. You overcome in that adversity. It builds you as a person. It\u2019s almost like putting armor on. It\u2019s putting armor on over your body for that next situation, \u2019cause it\u2019s not gonna be the end. That next situation, you\u2019re much stronger. You\u2019re more prepared and tough that next moment. And then the process starts over. I think adversity builds a person.\u201d\n\nEarlier in the season, he had played a role in the locker room, intervening to help calm a dust-up between Green, whom he\u2019d known from the 2016 gold medal-winning U.S. Olympic team, and Kevin Durant. \u201cHe was one of the guys talking to everybody in that situation, one of the guys making everything right,\u201d Green said.\n\nHis return began helping the Warriors on the court as well. The team won nine out of 10 games after Cousins\u2019 return. But by mid-March, Golden State was going through a lackluster stretch, having lost two of their previous three games, including a 128-95 blowout loss to the Boston Celtics at home on March 5 in which Cousins got a technical for a shoving match with the Celtics\u2019 Terry Rozier. He also was getting dinged in the media with questions about his mobility and defensive shortcomings.\n\nThe Boogie of old might have felt a type of way about that.\n\nThat guy would run the fast break, catch balls in traffic, rebound every board. But he was perpetually aggrieved, always squaring up against players he thought took cheap shots, and fighting with refs and teammates. He made history as the fastest player to hit 16 technical fouls and earn a one-game suspension since the NBA instituted the rule"} +{"output_text": " philosophies of mind that ignore the very existence of the brain; the philosophies of mind that ignore the very existence of the mind; the philosophies of mind that ignore the very existence of the self; the philosophies of mind that ignore the very existence of the body; the philosophies of mind that ignore the very existence of the world; the philosophies of mind that ignore the very existence of the universe; the philosophies of mind that ignore the very existence of the cosmos; the philosoph", "input_text": " those of Aristotle, Aquinas, Leibniz, Wolff, Kant, Hegel, or Lotze. The price has been diffidence for any attempts to build philosophical systems, and the concomitant preference for the brief essay or even the aphorism. Nowadays the expression esprit de systeme is used in a pejorative sense. But this diffidence is as unreasonable as it would be to mistrust physics or engineering because sometimes they fail. What is wrong is not to systematize (organize) ideas, but to cling dogmatically to this or that product of such effort. It is wrong because all things and all ideas come in systems. We ought to systematize ideas because stray ideas are unintelligible; because we need logical consistency; because deductive power is desirable; and because the world is not a pile of unrelated facts but a system of interrelated things and processes. In context, every idea drags other ideas. For example, every concept of truth involves the concepts of proposition and meaning. Second example: Relativistic physics has taught us that the notion of time must be treated in combination with the ideas of space, matter, and event. Third: The idea of human action relates the concepts of person, intention, value, goal, norm, outcome, social environment, and circumstance. In short, we need systems of ideas in all fields of learning and all walks of life, because the world is a system, our knowledge another system, and living involves interacting with systems. Why should philosophy be the exception? Just because the puny and ephemeral is easier than the great and durable? (9) Detachment from the intellectual engines of modern civilization. These engines are science, technology, and ideology. Detachment from them expedites wild and anachronistic speculation. Examples: The philosophies of mind that ignore the very existence of cognitive neuroscience; the"} +{"output_text": " along with Phil Keoghan, are about to take on the longest leg in the history of the race tonight on The Amazing Race 2013, so prepare yourself for tears and breakdowns! Watch it all happen during our The Amazing Race 22 Live Recap and see who was eliminated from The Amazing Race 2013 with us!\n\nThe teams, along with Phil Keoghan, are about to take on the longest leg in the history of the race tonight on The Amazing Race 2013, so prepare yourself for", "input_text": "quux\"}>\n\nLicense\n\nCopyright (c) 2018-2019 Dr. Ralf S. Engelschall (http://engelschall.com/)\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. March 10, 2013 at 7:00 PM EDT\n\nPhil Keoghan threw a wrench in the mix last week on The Amazing Race 22 and the leg of the race continued from the check-in with him, but things were left up in the air as Dave battled his ruptured Achilles tendon and unsure if he could continue on in The Amazing Race Season 22. The teams are about to take on the longest leg in the history of the race tonight on The Amazing Race 2013, so prepare yourself for tears and breakdowns! Watch it all happen during our The Amazing Race 22 Live Recap and see who was eliminated from The Amazing Race 2013 with us!\n\nThe teams,"} +{"output_text": " as the training exercise, were not connected.\n\n\u201cI was just trying to get people to be prepared for the possibility of a protest,\u201d he said. \u201cI was not trying to get people to be prepared for a protest.\u201d\n\nThe sheriff\u2019s department also sent out a mass email to its officers in early March, instructing them to be prepared for protests.\n\n\u201cWe are expecting protests in the coming weeks,\u201d the email said. \u201cWe are asking that you be prepared", "input_text": " neighboring county in 2016. In an interview with The Intercept, Gatterman acknowledged that his note might have been mistaken and the incident he referenced was likely one carried out by a separate group.\n\n\u201cWe are working in solidarity with the tribal groups and their nations that are standing against the destruction of their homelands,\u201d said Michelle Vendiola, a Walker River Paiute tribal member from Washington\u2019s Lummi Reservation and a founder of Red Line Salish Sea. \u201cWe have brought people up to British Columbia whenever the call is put out that that\u2019s what\u2019s needed.\u201d\n\nVendiola is intimately familiar with Whatcom County law enforcement officials\u2019 monitoring of anti-pipeline activists. Local prosecutors obtained a search warrant to collect private information from the Facebook page of the Bellingham #NoDAPL Coalition as part of an investigation into the I-5 blockade that ultimately led to criminal charges against Vendiola and six others.\n\n\u201cIt\u2019s crazy to think about how prepared they are getting to try to quash dissent here in Washington,\u201d Vendiola said.\n\nAs March wore on, law enforcement\u2019s efforts to prepare for large protests continued. An email Gatterman sent a few weeks after the Vancouver-area protests provided instructions for around 50 role players who would pretend to be protesters during a joint training exercise at Whatcom Community College involving around 30 officers from the sheriff\u2019s department and the Washington State Patrol.\n\n\u201cSome of you will be carrying signs, some of you will be locked together in what is called a \u2018sleeping dragon,\u2019\u201d Gatterman wrote. \u201cWe are looking for a few role players to dress in all black to simulate an anarchist group. This group also covers their faces with masks or bandanas.\u201d\n\nGatterman told The Intercept that the I-5 protest, as well"} +{"output_text": ". First, Trinity is not a fan of Cypher. Second, Trinity is not a fan of the Matrix. Third, Trinity is not a fan of the machines.\n\nThe Oracle\n\nThe Oracle is a character that is still in the Matrix. She is a prophet, of sorts, who has been telling Morpheus about The One. She has told Trinity that she would fall in love with The One.\n\nThe Oracle is a character that is still in the Matrix. She", "input_text": " \u2013 He\u2019s the pilot of the ship. He\u2019s Tank\u2019s brother and is also not from the Matrix.\n\nThe Oracle\n\nThis team has been using the guidance of a person called the Oracle (Gloria Foster). The Oracle is someone still residing in the Matrix. She is a prophet, of sorts, who has been telling Morpheus about The One. People who have been freed are taken to the Oracle and she advises them on things to come. The Oracle has told Morpheus that he will find The One. She has told Trinity that she would fall in love with The One.\n\nThe Matrix Explained: Introduction To The One\n\nThe prophecy stated by the Oracle is that the team will eventually find a person that they will free from the Matrix. He will own the ability to manipulate the Matrix to an extraordinary degree. We\u2019ll talk about this more later. Just know this that Morpheus and his crew have already been looking for The One.\n\nThe movie begins with Trinity and Cypher talking. This is the dialogue:\n\nCypher: You like him, don\u2019t you? You like watching him.\n\nTrinity: Don\u2019t be ridiculous.\n\nCypher: We\u2019re going to kill him.\n\nTrinity: Morpheus believes he is the One.\n\nCypher: Do you?\n\nTrinity: It doesn\u2019t matter what I believe.\n\nCypher: You don\u2019t, do you?\n\nTrinity: Did you hear that?\n\nCypher: Hear what?\n\nTrinity: Are you sure this line is clean?\n\nCypher: Yeah, of course I\u2019m sure.\n\nTrinity: I better go.\n\nA couple of things come out straight from here"} +{"output_text": " think about the future of the planet.\n\nThe rural-urban divide is not just a matter of geography. It is also a matter of values.\n\nThe rural-urban divide is not just a matter of geography. It is also a matter of values.\n\nThe rural-urban divide is not just a matter of geography. It is also a matter of values.\n\nThe rural-urban divide is not just a matter of geography. It is also a matter of values.", "input_text": " Mike Babcock. All hail Mike Babcock.\n\nKatya:\n\nSpecies: Make him the damn captain already.\n\nShut Up and Show Us Some of Those Sweet, Sweet Matthews Goals Dammit!\n\nFor everyone who skipped my blah blah blah above, this is where you should stop scrolling to relive the great Matthews goals: Not long ago on a farm south of Fresno, I watched a poorly paid mechanic in silence repair a gate\u2019s hydraulic ram as easily and rapidly as if he were Googling on a smartphone. He seemed to me a genius in oily clothes engulfed in a cloud of cigarette smoke.\n\nLater that same day, in Palo Alto, I talked to lots of mellifluent and highly compensated academics theorize about politics. I wondered whether they could tell hydraulic fluid from the engine oil in their imported cars. Who is really wise, who not?\n\nA red/blue political map of the 2016 election reflects these two antithetical worlds. Eighty-five percent of geographical America voted for Donald Trump. But more than half the country\u2019s voters living in just 15 percent of its land area went for Hillary Clinton.\n\nHow did we split into two countries? Why does rural America vote more conservative than liberal?\n\nThose in rural and small-town America \u2014 who were more likely to pump their own water, to worry about their septic tank and to fret whether the weather will allow them to profit or lose money \u2014 think, talk and vote differently from those who expect the tap always to flow, the toilet to flush regularly and to get paid on time, rain or shine, drought or flood.\n\nPragmatic, autonomous and struggling people of the countryside think about building new dams and freeways to match population growth; affluent urbanites and suburbanites, with the greater luxury of second and third chances,"} +{"output_text": " soup, pork knuckle and veal goulash. The menu is divided into two sections: the \u201cGr\u00fcnauer\u201d and the \u201cKronenberger.\u201d The former is a collection of Austrian classics, like Wiener schnitzel and goulash, while the latter is a collection of German classics, like pork knuckle and veal goulash. The Kronenberger is a bit more expensive, but it\u2019s worth it for the goulash, which", "input_text": " kitchen. Chances are one of your servers will be sisters Sharon or Susan Kwon, who own the restaurant with their mother and head chef, Suzanna Kwon. For the last decade, the Kwon family has been serving up traditional Korean dishes like bibimbap \u2014 a satisfying bowl of bulgogi (thin, grilled strips of marinated beef), bright vegetables and rice \u2014 and addictive japchae, where warm, translucent rice noodles tangle with sliced mushrooms and chunks of beef. NG\n\n30. El Pollo Rey\n\n901 Kansas Ave., KCK. | Inexpensive.\n\nEl Pollo Rey\u2019s menu is just three lines long. This busy restaurant on the Kansas Avenue strip of KCK only sells chicken, which comes by the whole, half or wing. The birds are scrunched together on a wood-fired grill and given a slow, smokey char. Then, the pollo is plated with rice, beans and warm corn tortillas. On the side come a Ziplock baggie of onions and a little styrofoam cup filled with a red salsa blended down to pulp. The Chicken King\u2019s smoky, blacked birds fall apart with the poke of a fork, which has won the restaurant an enthusiastic local following. Be warned that you might be waiting for a seat on weekends. MC\n\n31. Gr\u00fcnauer\n\n101 W. 22nd St., KCMO. | Expensive.\n\nWhat was life like for a minor Hapsburg of the late empire? This we cannot know. However, we have some idea thanks to this palatial Austrian-German dining room in the Crossroads\u2019 Freight House. The wood-roofed Gr\u00fcnauer feels like a chalet perched on some sheer Alpine cliff, and it delivers refined versions of hearty fare like wild mushroom"} +{"output_text": "climbs up the ladder] [climbs down the ladder] [climbs up the ladder] [climbs down the ladder] [climbs up the ladder] [climbs down the ladder] [climbs up the ladder] [climbs down the ladder] [climbs up the ladder] [climbs down the ladder] [climbs up the ladder] [climbs down the ladder] [climbs up", "input_text": " Ali Baba. Open sesame! [shoots the door three times] Fine, then. Close sesame! Man, three rounds of buckshot point-blank. What the hell? It's like one of those doors from Looney Tunes where they blow up the whole building but the door is still standing. I bet it's locked on both sides and nobody has the key.\n\n[preparing to jump over an elevator shaft to a ladder] Freeman: This, right here, is why you should eat Wheaties in the morning. I guess anything would be better than the two shots of vodka I had. All right, let's do this. [sprints] HOO-ga-sa-ka HOO-ga-sa-ka HOO! [leaps, falls] Oh, shit, oh shit, OH SHIT! [splat] [flatline] HEV: HEV activated. Automatic medical systems engaged. Major fracture detected. Internal bleeding detected. Emergency: user death imminent.\n\nEpisode 11 [ edit ]\n\n[Gordon encounters an open elevator shaft, with the only ladder up on the opposite side] Freeman: So, my only way out of here is to take some flying leap of faith, like that scene in Indiana Jones and the Last Crusade, then claw like a mad cat, and hope like hell I get a grip and don't break my ribs! Once again, I need a grappling hook. I can't believe this. Why do you have a ladder in an elevator shaft? To fix the elevator! How do you get to the ladder? You take the elevator that doesn't work! Who thought this one up? Jesus Christ! I suppose I could do the math on whether this jump is feasible or not, but, y'know... I'll have plenty of time for that when I'm dead. ["} +{"output_text": ": [laughs]\n\nFreeman: [to a marine] Now, I be a man of my word. I be a man of my word. I be a man of my word. I be a man of my word. I be a man of my word. I be a man of my word. I be a man of my word. I be a man of my word. I be a man of my word. I be a man of my word. I be a man", "input_text": " one and negative one. [shoots again, killing it] Now it's negative two, and me. But wait, wouldn't I be number one? I don't know what the hell I'm talking about. See, this is why you have to define your terms. If you don't, people die.\n\nEpisode 27 (April Fool's 2010) [ edit ]\n\nFreeman: 'Tis true of the\u2014 [hears a noise from behind him] What be that noise? Arr, these caves be haunted, says I. But livin' and dead alike shall bow before the great Cap'n Freeman.\n\nFreeman: [Shoots a marine with a shotgun] Arr, the bloodlettin' be flowin' over. I shows no quarter to lubbers such as thee. Ye calls yerselves marines but mariners I says yer not. I bet none of ye could rig a bunt-gasket 'round a mast and jigger if yer lives were hangin' in the balance!\n\nFreeman: Now what'a we have here? An anti-scurvy machine. [headcrabs fall from the ceiling tiles] Shiver me timbers! [shoots them] By the powers! There be all manner o' queer beasties in this hold! I cares not for 'em.\n\nFreeman: [speaking to a security guard] Ho thar, squire! What say ye to joinin' me crew? I gives ye my affidavi' I give ye yer cut of any loot we take. Guard: Okay, why not? Freeman: YA\u2014 [is interrupted by the guard] Guard: Didn't want to die alone anyway. Freeman: YARR! That be the spirit! Let us charge forth and paint the walls red with blood! Guard"} +{"output_text": " the one to flip, and he wanted to be the one to vote out Elizabeth. I told him that I would be fine with that, and I would vote for him.\n\nI was told by Missy that she was going to vote for Tommy, and she was going to vote for me. I told her that I would be fine with that, and I would vote for her. I was told by Elizabeth that she was going to vote for Tommy, and she was going to vote for", "input_text": " 12 to the merge, I had never been made aware of any issues involving him from early on in the game with original Vokai. I never heard any chatter from new Vokai that anyone felt uncomfortable with his actions. I\u2019m not saying this didn\u2019t happen; I want to be clear. I\u2019m saying I had no knowledge of any issues with Dan\u2019s behavior. Dan vocalized to me on several occasions that the only way he would flip with the four of us (Elaine, Missy, Elizabeth, and myself) was if Kellee and Janet agreed as they were his core alliance. This was understood.\n\nSo what happened at the merge?\n\nFast forward to the merge day. Right out of the gate, I was told by several people myself, Missy, and Elizabeth would be the first targets as directed by Janet and Kellee. We were told that Janet\u2019s job was to woo Elizabeth and make her feel comfortable as they were both swimmers, and Kellee was going to do the same with Missy. Tommy confirmed all of this with me and apologized because Kellee and Janet told him they would not flip to work with us, meaning we were going to be voted out per Janet and Kellee\u2019s direction. I relayed this information to Missy and Elizabeth of what Janet and Kellee\u2019s plans were.\n\nThe three of us had our backs against the wall, and we all started to scramble. I filled Tommy in on Jamal\u2019s misplayed idol, and shortly after, Tommy came back to me and told me he and Lauren were flipping, and the vote was now going to be Kellee. Not that I cared for justification at that point, as my fate was in his hands, but he explained that he could no longer trust her. He also explained that he didn\u2019t want to be"} +{"output_text": ") and the recruitment of players from abroad (such as the \u00a330m signing of Brazilian midfielder Paulinho from Corinthians), they have become a team that can compete with the best in Europe.\n\nSecond, they have been able to do this without the need to spend big on players.\n\nThird, they have been able to do this without the need to spend big on players.\n\nThe first point is the most important.\n\nTottenham have been able to compete with the", "input_text": " a lack of trophies since the 47-year-old joined the club from Southampton in 2014.\n\nYet it is hardly the case that the Argentine is failing where others have succeeded.\n\nSince beating Anderlecht on penalties in the 1984 Uefa Cup final, Tottenham have won one FA Cup and two League Cups, the most recent of which came in 2008.\n\nTheir status as one of England's original \"big four\" along with Arsenal, Liverpool and Manchester United comes from a 12-year period between 1956 and 1967. The high point was becoming the third side to complete the English domestic double in 1961.\n\nIn addition, they won the FA Cup in 1962, the European Cup Winners' Cup a year later and the FA Cup again in 1967. During that period they were second in the old First Division twice, third four times and registered two more top-six finishes.\n\nAnd between 1959 and 1962, Tottenham had the highest average attendance in England.\n\nIt is against that glorious past that Pochettino appears to be being judged, even though the last quarter of a century has been a relative struggle.\n\nUntil 2017, when he finally broke the cycle, Tottenham had finished behind north London rivals Arsenal 21 seasons in succession. Between 1991 and 2009, they never finished higher than fifth, which they did twice. On seven occasions, they ended the season below 10th.\n\nIt is this recent history that Pochettino feels justifies him saying \"we are not a big team\".\n\nA transition to the European elite - on and off the pitch\n\nEveryone here wants to win trophies, says Winks\n\nThe change in perception of Tottenham is down to three things.\n\nFirst, through a combination of excellent home-grown talent (England skipper Harry Kane and midfielder Harry Winks"} +{"output_text": " additional 100 people to march.\n\n\u201cI\u2019m not a big social media person, but I\u2019m a big organizer,\u2019\u2019 Villela said. \u201cI\u2019m not a big Facebook person, but I\u2019m a big organizer.\u2019\u2019\n\nVillela, who lives in the Houston Heights, is a member of the Houston chapter of the Democratic Socialists of America, a group that has been growing in the city.\n\n\u201cI\u2019m not a socialist,", "input_text": " plays coy on making Super Bowl halftime show statement\n\nUSA TODAY Sports' Super Bowl LI predictions: Patriots or Falcons?\n\nHouston police chief Art Acevedo said his department won\u2019t be caught flat-footed by the protests.\n\n\"I\u2019m sure there will be more,\u2019\u2019 he told USA TODAY Sports earlier this week. \u201cYou always prepare for them to happen.\u2019\u2019\n\nThree will start at city hall and embark on marches, two ending near the NFL\u2019s temporary headquarters downtown and one ending at at NRG Stadium. Another rally will start at the stadium and is expected to merge with other groups.\n\nDetails of the protests can be found on the #ResistHouston Facebook page, where almost 1,800 people have indicated they will participate and another 6,800 people have indicated they are \u201cinterested.\u201d\n\nNEW RECRUITS\n\nWhen it comes to the organizers, it\u2019s not just the usual suspects anymore. Melanie Villela, 30, a mother of three who is a mechanical drafter, has emerged as one of the unlikely leaders.\n\n\u201cI feel this is more important than watching a bunch of grown men playing football,\u2019\u2019 Villela told USA TODAY Sports earlier this week, \u201cand I know our strength is in numbers.\u2019\u2019\n\nVillela was less optimistic last week when she went online in search of a march scheduled for Saturday, a day she could juggle parental responsibilities and protest. She found nothing, so she decided to organize her own rally and created a Facebook page. A few hours later, she checked the page and its progress.\n\nShe was the only confirmed participant.\n\nThe next morning, she got a call from her excited mother. Almost 100 participants had confirmed. By Thursday there were more than 700 people committed to participate, and Villela also secured an"} +{"output_text": "\nThe game is missing a few key things. On the first page the text says to talk with the other players and figure out who your characters are and what sort of journey this is. That\u2019s it: you just make it up with no guidance whatsoever. For a game this surreal, I\u2019m not sure that\u2019s going to cut it. The players need to know how to make characters that fit with what this game is trying to do. Additionally, I wonder about deciding the characters", "input_text": " towards it, or the Attachment, holding on to things here. There are also random elements called forks (\u201cSomething burns!\u201d) that can be added to later scenes. There\u2019s an endgame that determines your character\u2019s final outcome based on how many cards you received from the players on either side, when you were playing the Attachment and Enlightenment. This is all pretty neat. Completeness: But the game is missing a few key things. On the first page the text says to talk with the other players and figure out who your characters are and what sort of journey this is. That\u2019s it: you just make it up with no guidance whatsoever. For a game this surreal, I\u2019m not sure that\u2019s going to cut it. The players need to know how to make characters that fit with what this game is trying to do. Additionally, I wonder about deciding the characters ultimate fate based on the choices made by other characters based on how the players portrayed abstract forces. That is completely arbitrary and unconnected to the actual characters choices, which adds to the randomness but doesn\u2019t really feel appropriate if the game is really about choice. Finally, it\u2019s great that characters get to choose which spiritual direction a given scene pulls them in, without any randomness or mechanical resolution, but the simplicity of this also leaves the effect of that choice on the endgame completely transparent, since the cards are all lying on the table. As a player, I know what the different choice options will mean for the players on either side of me when the endgame comes and that serves as a distraction from the workings of the scene itself and freely making that choice. I also feel like, for a game about spiritual salvation, the choice itself is pretty toothless, right? What\u2019s the difference between choosing between them? Is choosing Enlightenment really just that easy?\n"} +{"output_text": " that Westeros was \u201ca great land\u201d and that the Wall was \u201ca long way\u201d from the south coast of Dorne.\n\nThe maps for A Game of Thrones were redrawn in the early 2000s, and the scale bar was added. This was done to make the maps more accurate, but also to make the maps more useful for readers. The scale bar was added to the maps to make it easier for readers to determine the size of Westeros.\n", "input_text": " similar ones) may provide just the clarity you need to start building momentum!\n\n(Side note: If you\u2019re still struggling to define your financial values, watching these TED Talks on money could help!)\n\nIf you found this post insightful, you\u2019ll want to sign up for our email newsletter to receive our FREE \u201cBreakthroughs for Beginners\u201d email course. You\u2019ll get a PDF version of the questions included in this post as well as other helpful printables! Having determined a (working rough) basis for the size of the planet, we need next to determine the size of Westeros itself, the continent on which most of the action in A Song of Ice and Fire takes place, and determine its location on the globe. We know it is in the northern hemisphere, but not how far it extends.\n\nThe original maps for A Game of Thrones did not include a scale bar, and many readers assumed that Westeros was somewhat small, maybe not much larger than Great Britain. This was backed up by the comparisons with English medieval history, particularly the Wars of the Roses. However, more attentive readers noted that the Wall was said to be 300 miles long and extrapolated from this that Westeros was considerably larger, a full-sized continent rather than an island or peninsula. This was backed up by the extensive travel times for King Robert\u2019s party from King\u2019s Landing to Winterfell and back again. Very rough calculations using the Wall as a scale bar suggested that Westeros measured in fact about 3,000 miles from the Wall to the south coast of Dorne.\n\nMartin was wary of absolutely pinning down the size of Westeros and travel times, fearing that sharp-eyed readers would pick up on unintended mistakes and errors in the text and complain about them. He decided to be vague, stating"} +{"output_text": " \u201eKameradschafts-Magazins\u201c \u201eKameradschaft\u201c, zu Gast.\n\nAuch der ehemalige Fu\u00dfball-Profi und Ex-Nationaltorwart Michael K\u00f6llner war anwesend. Er hatte sich in der Vergangenheit mit dem Verein \u201eChemnitzer FC\u201c und dem Verein \u201eFC Magdeburg\u201c in Verbindung gebracht.\n\nAuch der e", "input_text": " den gewaltt\u00e4tigen Ausschreitungen vor gut einem halben Jahr erneut in die Schlagzeilen. Einmal mehr offenbarte sich das alte Problem mit rechten Strukturen im Fu\u00dfball. Welche Lehren der Verein daraus zieht, erscheint mehr als offen.\n\nHooligans aus Magdeburg und Berlin legten einen Kranz nieder\n\nAn die 1.000 Personen marschierten am Montagnachmittag in Chemnitz auf, um der dahin geschiedenen Szenegr\u00f6\u00dfe Thomas Haller die letzte Ehre zu erweisen. Zuvor fand die offizielle Trauerfeier, also im famili\u00e4ren Rahmen, auf dem st\u00e4dtischen Friedhof statt. Nach kurzem Marsch zur Chemnitzer Michaeliskirche reihte sich die rechte Trauergemeinde vor dem Gel\u00e4nde des Gotteshauses auf. Abgesehen von Drohungen und P\u00f6beleien gegen Pressevertreter blieb der Aufmarsch ohne Zwischenf\u00e4lle.\n\nDie G\u00e4ste: Neonazis und rechte Fu\u00dfballfans. Unter ihnen fanden sich nicht nur lokale Szeneg\u00e4nger, wie ehemalige Mitglieder der inzwischen aufgel\u00f6sten Neonazi-Hipster-Gruppierung \u201eRechtes Plenum\u201c, sondern auch rechtsextreme Kampfsportler und bekannte Gesichter aus der Rechtsrock-Szene. Neben Ex-Landser-Frontmann Michael Regener kam auch Yves Rahmel, langj\u00e4hriger Inhaber des"} +{"output_text": " to the Senate until the Senate had a chance to hold a trial.\n\n\u201cWe\u2019re not going to do it until we have a trial,\u201d she said. \u201cWe\u2019re not going to do it until we have a fair trial.\u201d\n\nThe Senate majority leader, Mitch McConnell, has said he will not hold a trial until the House has passed articles of impeachment. Photograph: Chip Somodevilla/Getty Images\n\nThe House has passed two articles of impeachment, one for", "input_text": " at the trial is prescribed in the constitution but whose actual role appears largely up to his own discretion.\n\n\u201cThe rules give us a lot of specific stage instructions about what certain actors are supposed to say, the oaths they take, and at what time of day certain events are supposed to occur,\u201d said Hilary Hurd, a JD candidate at Harvard Law who has dissected the procedure for the Lawfare blog.\n\nEven in spite of the president\u2019s clear misconduct, everything for Senate Republicans comes down to politics right now Elliot Williams\n\n\u201cBut when it comes to the major aspects of how this trial is actually going to work, it lays out some guidelines, but they\u2019re default guidelines that can be changed with 51 votes. So the twists and turns of the actual trial are not something that can be envisioned by just reading the rules.\u201d\n\nIn the weeks leading up to impeachment, an intriguing split developed between McConnell, with his quickie game plan, and Trump, who has expressed a desire to produce the trial as reality TV tailored for conspiracy theorists and Trump superfans.\n\nTrump dared the House to impeach him \u201cfast\u201d, \u201cso we can have a fair trial in the Senate,\u201d with a parade of witnesses who would \u201creveal, for the first time, how corrupt our system really is.\u201d The president\u2019s draft witness list includes House intelligence chair Adam Schiff, House speaker Nancy Pelosi, Joe and Hunter Biden \u201cand many more\u201d.\n\nIn reply, Senate Democrats have proposed a witness list of their own, including former national security adviser John Bolton, acting chief of staff Mick Mulvaney and two officials from the office of management and budget, all of whom the White House blocked from testifying previously.\n\nHouse speaker Nancy Pelosi injected added uncertainty to the process on the night of Trump\u2019s impeachment, by saying the House would wait to refer articles of impeachment"} +{"output_text": "\n\n\u201cI\u2019m sorry,\u201d Laura wrote back. \u201cI\u2019m so sorry.\u201d\n\n\u201cI know,\u201d Samantha wrote. \u201cI\u2019m sorry too.\u201d\n\n***\n\nChaya\u2019s death was a shock to the family. Her sister Samantha, who was in the room when she died, told me that she had been in the hospital for a week before she died. She was in a coma for a while, and then she was in a coma for", "input_text": ", pajamas, little trinkets, and stuffed animals.\n\nOn the night with Jeremy at the bar, I motioned to the restaurant across the street. Chaya and I had once gone there for drinks and guacamole. Jeremy was shocked that Chaya drank; he couldn\u2019t recall her ever doing that in his presence. That got me doubting my own memory \u2014 had Chaya actually had an alcoholic beverage? Or did she order something else? But there was definitely guacamole. That much I\u2019m sure of. After a brief pause, I said, \u201cI think she had a drink. I know I paid.\u201d\n\n***\n\nNearly five years after first meeting Chaya, my friend Laura texted me a screenshot of a conversation she had with Samantha, Chaya\u2019s younger sister, on Facebook. Though the chat was conducted through Chaya\u2019s own Facebook account, we had all been under the impression that her sister \u2014 not Chaya \u2014 was in control of her digital life since her health had taken a turn for the terminal. This process happened gradually; first the emails were signed by both Chaya and Samantha. Then, as her health worsened, Samantha was the sole writer, explaining that Chaya didn\u2019t really use the computer much anymore and starting chats with \u201cSamantha here\u201d or \u201cIt\u2019s Samantha.\u201d After a while, you didn\u2019t question it.\n\nKickstart your weekend reading by getting the week\u2019s best Longreads delivered to your inbox every Friday afternoon. Sign up\n\nLaura had privately reached out to offer support after reading a particularly grim update to Chaya\u2019s Facebook account, and Samantha wrote back, \u201cIf I wasn\u2019t living it, dint [sic] have the medical proof id think i was making this all up.\u201d"} +{"output_text": "agh Strand\n\nBinevenagh Strand is a beautiful beach with a long stretch of sand and a small rocky outcrop. It is a popular spot for surfers and is a great place to watch the sunset.\n\nThe drive to Binevenagh Strand from Downhill Strand takes around 25 minutes.\n\nBack to top\n\n32 Binevenagh Strand\n\nBinevenagh Strand is a beautiful beach with a long stretch of sand", "input_text": " Hedges, we cut across the country going east to Port Stewart. This drive takes just over half an hour.\n\nBack to top\n\n29 Port Stewart Strand\n\nPort Stewart, it is one of Northern Ireland\u2019s best beaches with views of Inishowen headland and Mussenden Temple perched atop the cliffs. It has been a family holiday destination for several hundred years treating many a Northern Irish family to a seaside vacation.\n\nThe strand doubled as the coast of Dorne in season five of Game of Thrones when Jaime Lannister was begged by Cersei to bring her daughter Myrcella back to King\u2019s Landing.\n\nIt was also here among the huge sand dunes and shore grasses that Jaime and Bronn found themselves captured by Dornish soldiers on the coast of Sunspear, and where Ellaria and her Sand Snake sisters conspired to start a war with the Lannisters.\n\nThe drive to Downhill Strand from Port Stewart takes around minutes.\n\n30 Downhill Strand\n\nDownhill Strand is a stretch of spotless white sands, overlooked by Mussenden Temple, a replica of the Temple of Vesta in Rome. Popular as a great surfing spot for locals this beach was where Stannis Baratheon rejected the seven old gods of Westeros. He is proclaimed as the champion of the Lord of Light by Melisandre and enters Dragonstone into the War of the Five Kings.\n\nFrom Downhill strand, the drive to Binevenagh will take around 25 minutes in total. I would totally recommend a Guinness stop at Owens in Limavady, which is on the way, and you can check out Door 5, which features the Night King and his followers.\n\nBack to top\n\n31 Bineven"} +{"output_text": "\ufffd\u0337\u0322\u0322\u0315\u035c\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\u0361\ufffd", "input_text": ", Matthews stopped. The sound of a small electric motor drifted into the air, and dread crept into his heart once more. Slowly turning around, he saw the Bag Man on a moped, barreling towards them and the exit.\n\n\"GET DOWN!\", Matthews yelled out, diving to one side of the exit. Bradford complied, taking the other side. As she leveled her gun at the moped\u2026\n\n\"NOT TODAY, COPPERS!\"\n\nThe Bag Man pulled out the toy rifle and fired at her, all in an impossibly fast motion (and while maintaining control of the moped). The shot failed to hit her directly, but the resulting shock-wave knocked her off her feet and into a wall. Bag Man's maniacal laugh petered out as he scooted away, two more blasts signalling the destruction of the remaining checkpoints.\n\nMatthews ran over to Bradford, quickly checking for a pulse. Blood leaked from her ears and mouth, but there was still a heartbeat, albeit a shaky one. Coughing signaled her return to consciousness, more blood spilling out as she did so. After a few pained breaths, Matthews felt her telepathy kick in:\n\n\"w\u0338\u0334\u0337\u0358\u00e8\u0300\u0315\u0315\u0362 \u0315n\u0338\u0315\u0358e\u0335\u0322\u0322e\u031b\u0315d\u0321\u034f\u031b\u0300 \u0300\u0315\u0362\u035dt\u0336\u035co\u0335\u034f \u0322\u0301\u035d\u0360f\u0321\u035c\u035di\u0334\u0315\u035c\u035d\u0144\u0315\u035c\u0362d\u0358\u035f\u0360 \u0335\u0321\u0300\u0358\u035dh\u0337\u035d\u0360\u034f\ufffd"} +{"output_text": " and the resulting design was scaled down to a length of 1,800 feet (550m) and a displacement of 50,000 tons. The resulting design was the Yamato class, which was to be the largest battleship ever built. The design was approved by the Navy Ministry in December 1934 and construction began in the summer of 1936. The ship was launched on the 30th of July 1938 and commissioned on the 15th of April 1939. The ship was named after the ancient capital of Japan and", "input_text": " submarine attack and the hit on Yamato was simply one more example to add to many others. What was far more significant for the leaders of the Imperial Japanese Navy was the extent of the damage. The explosion had occurred on the starboard side beneath the after l8in main triple gun turret. Quoting from her Captains report to the Navy Ministry the damage was as follows:\n\n\n\n\"A hole about l6ft (5m) deep extending downwards from the\n\n\n\nMIDSHIP SECTION OF YAMATO CLASS BATTLESHIP top of the bulge connection and S2ft (25m) in length between frames 151 and 173. Water flooded into No 3 turret upper magazine from a small hole in the longitudinal bulkhead caused by caving in of the waterline armor\"\n\n\n\nPut simply, her underwater defenses had been breached by a single torpedo and she had shipped over 3000 tons of water, something which her designers had worked assiduously to avoid. The resultant concern deepened when it was learnt that the torpedo had been running shallow and had struck only four\n\n\n\nSECOND HOLD DECK PLAN feet below the surface, where the explosive effect, which increases with the depth of water, had not been particularly great. What is interesting is how and why the design failed on a ship which the Japanese had always intended to he the pinnacle of battleship excellence and one certainly capable of stopping a single torpedo. From the time the, Japanese Naval General Staff ordered the Bureau of Naval Construction to study such a proposal in the autumn of 1934 it was clear the vessel would he enormous since the only sure way of building in superiority in the three key elements of speed, firepower and protection, was to increase the size of the vessel. The first calculations of her principal designers, Yuzura Hiraga and Keiji Fukuda, proved too ambitious"} +{"output_text": "\nThe customers' lawyers say that the trustee's formula is flawed because it does not take into account the fact that the customers' money was used to pay off earlier investors.\n\n\"The trustee's formula is flawed because it does not take into account the fact that the customers' money was used to pay off earlier investors,\" the lawsuit said. \"The trustee's formula is flawed because it does not take into account the fact that the customers' money was used to pay off earlier investors.\"\n", "input_text": ", Ill., which had about $214 million in assets and $202 million in deposits as of May 26.\n\nAll of Bank of Lincolnwood's deposits will be assumed by Republic Bank of Chicago, based in Oak Brook, Ill., which also agreed to buy about $162 million of the bank's assets; the FDIC will retain the rest for eventual sale. Bank of Lincolnwood's two offices will reopen on Saturday as branches of Republic Bank of Chicago.\n\nThe FDIC estimates that the cost to the deposit insurance fund from the failure of Bank of Lincolnwood will be $83 million.\n\nIn a step that would substantially increase the price tag for Bernard L. Madoff's long-running Ponzi scheme, lawyers for a group of his victims are asking a federal bankruptcy judge to reject the way their losses in the fraud are being calculated.\n\nThe customers insist that, by law, they should be given credit for the full value of the securities shown on the last account statements they received before Madoff's arrest in mid-December, even though the statements were bogus and none of the trades were ever made. According to court filings, those account balances add up to more than $64 billion.\n\nAfter months of private negotiations and Internet arguments, lawyers for these customers formally put the issue before the federal bankruptcy court in New York in a lawsuit filed late Friday evening, less than a month before the deadline for filing claims for compensation.\n\nThe approach they seek would produce a significantly higher tally of cash losses than the formula being used by the court-appointed trustee overseeing the claims process for the Securities Investor Protection Corp., a government-chartered agency financed by the brokerage industry.\n\nThe trustee, Irving H. Picard, is calculating investor losses as the difference between the total amount a customer paid into the scheme and the total amount withdrawn before it collapsed.\n"} +{"output_text": "ing format. Bower has a good interop story, but does this on the authoring side and mandates a consumer format.\n\nThe only package manager that I\u2019m aware of that has a good interop story is npm. But npm is a package manager for JavaScript, not Java.\n\nThe problem\n\nThe problem is that JavaScripters are not Java developers. They\u2019re not even Java developers who are comfortable with Java. They\u2019re not even Java developers who", "input_text": ".\n\nThe extras\n\nWhile you\u2019re at it solve CSS, image, and font dependency problems (each of which could use their own 5000 word exploration) and include a working lock file mechanism. I would also like a unicorn.\n\nAbandon All Hope\n\nRight now you\u2019re probably thinking: this kind of behavior doesn\u2019t belong in a package management system. And you\u2019re absolutely right. I can\u2019t think of any other package manager off the top of my head that concerns itself, in anyway, with how fine-grained dependency resolution occurs or how a library\u2019s exposes its public API.\n\nThis behavior belongs in the language proper. But that\u2019s not the world we have.\n\nAfter five years of community experimentation I\u2019m pretty convinced this problem won\u2019t go away until we all start using the same module format for both authoring and consuming[3] or some central broker makes publishing and consuming formats frictionless. Does this belong in a package manager? No. But it\u2019s the best place we\u2019ve got.\n\nWoe unto we unlucky JavaScripters: No package manager or registry is really solving these problems yet. Bower gets the flat hierarchy right, but has no convention around project structure so everyone is writing their own shim code to get files from bower_compoents into their projects.\n\nDuo and Bower both let you point to source control for packages, but source files aren\u2019t builds so authors have to either create new source repos that are really just builds or include build files in their source. It\u2019s a headache for authors and since the pattern isn\u2019t consistent across all projects, it\u2019s a headache for consumers.\n\nDuo has a good transform-based interop story, but does this on the consumer side and mandates an author"} +{"output_text": " library for D. It is a part of the Dlang.org project. It is a library for creating GUI applications in D. It is a part of the Dlang.org project. It is a library for creating GUI applications in D. It is a part of the Dlang.org project. It is a library for creating GUI applications in D. It is a part of the Dlang.org project. It is a library for creating GUI applications in D. It is a part", "input_text": " from hardcoded paths (suitable for Ubuntu). > TODO: add fontconfig support to access all available system fonts. > > Helloworld: > > // main.d > import dlangui.all; > mixin DLANGUI_ENTRY_POINT; > > /// entry point for dlangui based application > extern (C) int UIAppMain(string[] args) { > // resource directory search paths > string[] resourceDirs = [ > appendPath(exePath, \"../res/\"), // for Visual D and DUB builds > appendPath(exePath, \"../../res/\") // for Mono-D builds > ]; > > // setup resource directories - will use only existing directories > Platform.instance.resourceDirs = resourceDirs; > // select translation file - for english language > Platform.instance.uiLanguage = \"en\"; > // load theme from file \"theme_default.xml\" > Platform.instance.uiTheme = \"theme_default\"; > > // create window > Window window = Platform. instance. createW indow(\"My Window\", null); > // create some widget to show in window > window.mainWidget = (new Button()).text(\"Hello world\"d).textColor(0xFF0000); // red text > // show window > window.show(); > // run message loop > return Platform. instance. enterMess ageLoop(); > } > > DDOC generated documentation can be found there: > For more info see readme and example1 code. > > I would be glad to see any feedback. > Can this project be useful for someone? What features/widgets are must have for you? > > > Best regards, > Vadim > Hello!I would like to announce my project, DlangUI library - cross-platform GUI"} +{"output_text": " company spent.\n\nAD\n\nAD\n\nThe company has also pledged to create 25,000 jobs in the city over the next 10 years, and it has said it will spend $2 billion on construction and infrastructure projects.\n\n\u201cWe\u2019re going to be a good neighbor,\u201d Bezos said. \u201cWe\u2019re going to be a good neighbor to the city of Seattle, and we\u2019re going to be a good neighbor to the state of Washington.\u201d\n\nAmazon has", "input_text": ".. [The announcement] requires some level of coordination and advance notice.\u201d\n\nSharon Bulova (D), chair of the board of supervisors in Fairfax County, Va., where one of the sites is located, has felt the vibe.\n\n\u201cWhat I pick up from residents is enthusiasm,\u201d she said. \u201cThey\u2019re excited about the jobs that Amazon would bring.\u201d\n\nIt\u2019s also anxiety. Months of waiting have not quelled concerns about the potential pressure Amazon could place on the region\u2019s already steep housing prices, congested roads and yawning divide between its wealthy and low-income residents.\n\nWhen Bezos spoke at an Economic Club of Washington event in September, more than a dozen protesters occupied the sidewalk outside and civic groups \u2014 sometimes joined by union activists \u2014 have raised concerns about what the addition of such a fast-growing company would mean for the region\u2019s schools, roads and housing prices.\n\nAD\n\nAD\n\nEven without Amazon, the Metropolitan Washington Council of Governments has estimated that the region needs to add 235,000 housing units by 2025 to keep pace with expected job growth.\n\nAmazon\u2019s arrival could push the goal to around 267,000 by 2026, according to a recent analysis by the Urban Institute. Right now the region is on pace to add only about 170,000 new units by then, and the shortage threatens to exacerbate inequality.\n\n\u201cWhether Amazon comes or not, we have a challenge there,\u201d said Peter Tatian, of the Urban Institute. \u201cThe economic growth that has been going on has been benefiting some people and causing problems for others.\u201d\n\nAmazon says it plans to make $5 billion in capital investments alone in the city it chooses and that its headquarters injected an additional $38 billion into the local economy in Seattle, generating an additional $1.40 for every dollar the"} +{"output_text": " not even clear whether the Liberals will hold a nomination meeting at all.\n\nThe Liberals have also been accused of trying to influence the outcome of the nomination race in the riding of Scarborough Southwest, where former Liberal cabinet minister and current Toronto city councillor Kristyn Wong-Tam is seeking the nomination.\n\nWong-Tam has been a vocal critic of the Tories, and has been a vocal supporter of the Liberals in the past. She has also been a vocal supporter of", "input_text": " guess\".\n\nAnd in his final post, he asked for stories about his brother, presumably in order to expose them to the public. \u200bWhen it comes to bizarre shenanigans over party nominations in the lead-up to the 2018 provincial election, there\u2019s no question the Ontario PC Party takes the cake.\n\nWe\u2019ve documented several examples of irregularities in this space over the past several months, as numerous PC candidates have vied for nominations. Allegations of ballot box stuffing, the leader\u2019s office favoring candidates when it\u2019s pledged to stay neutral, police investigations, lawsuits, meetings inaccessible to the physically handicapped \u2014 the list goes on. The Tories have looked not at all ready for prime time. They\u2019ve even called in PwC to monitor the nomination meetings, although few people seem to know precisely what services the consulting firm is providing.\n\nIt\u2019s also fair to say that this has been a bigger problem for the Tories than for the other parties \u2014 because the PCs have led in the polls for almost three straight years, many people expect them to win the next election, and thus a PC nomination is a more prized thing to have these days.\n\nBut the governing Liberals have done some curious things as well. For example, with former cabinet minister Glen Murray having stepped out of public life, his Toronto Centre seat is now vacant. Since it\u2019s considered one of the safest Liberal seats in the province, you can imagine many potential candidates are interested to enter that contest.\n\nStay up to date! Get Current Affairs & Documentaries email updates in your inbox every morning.\n\nHowever, despite the fact the election is only a little more than seven months away, the Liberal Party\u2019s head office has so far not set a nomination meeting date. And according to several people I\u2019ve spoken to, it\u2019s"} +{"output_text": " of the extent of His atonement. The offer of the gospel is the proclamation of the gospel. The proclamation of the gospel is the proclamation that Christ died for the sins of the whole world. The proclamation of the gospel is the proclamation that Christ died for the sins of all people, and that He died for the sins of all people, and that He died for the sins of all people, and that He died for the sins of all people, and that He died", "input_text": " be saved by the death of Christ has a really unlimited atonement. Evangelicals with an atonement which is unlimited in extent limit the power or efficacy of that atonement to actually save those for whom Christ died. Calvinists limit the extent of the atonement. But both limit the atonement! This is why\u2014by the way\u2014I prefer to describe limited atonement as particular redemption.\n\n(2) Calvinists limit the value of the atonement.\n\nActually, it is Arminians who do this! But it is certainly not Calvinists who limit the value of the atonement. Listen once more to the Canons of Dort:\n\nSECOND HEAD: ARTICLE 3. The death of the Son of God is the only and most perfect sacrifice and satisfaction for sin, and is of infinite worth and value, abundantly sufficient to expiate the sins of the whole world.\n\nThe question debated between Arminians and Calvinists in regard to limited atonement is not, then, how much the atonement is worth or how valuable the redemption price paid by Christ is. The question is for whom was it paid and for whom was atonement made.\n\n(3) Limited Atonement contradicts the free and well-meant offer of the gospel!\n\nArminians make this claim because they rightly conclude that limited atonement means that we Calvinists cannot tell everyone you meet that Christ died for them. If limited atonement is true, then Christ did not die for everyone, and we may not say that He did! This seems a serious issue for the one who assumes that sharing the gospel means telling people that Christ died for them.\n\nThe problem is that the offer of the gospel does not consist in anybody\u2019s view of whom Christ died for, or statement"} +{"output_text": ". The Wallabies and All Blacks played each other in the first match of the tournament. The Wallabies won, but the All Blacks were the better team. The All Blacks have a better combination of speed and power, and the Wallabies are more experienced.\n\nThe All Blacks have a better combination of speed and power, and the Wallabies are more experienced.\n\nThe Wallabies have a better combination of speed and power, and the All", "input_text": ".\n\nUsually, all teams have a big turnout. In years past, that has meant Thursday matches, double and sometimes triple-headers. But with the revised competition, a change has been made that only provides a \u2018soft opening\u2019.\n\nWhy? The International rugby calendar and a priority internal tour for the South African Rugby Union (SARU) means that they must start a week earlier than desired. Therefore, the partner unions have ceded to SARU, as negotiations of rugby fixtures are more complicated than a US Senate budget.\n\nIf fans are confused, the bigger picture will make them happy. Back to 15 teams is a better fit. Reduced to three conferences, it eases the stress and while the new relationships of the African and Australian groups need time to bed-in, consensus is that it is a better product.\n\nWe're starting to feel Super Rugby fever, are you? In the first installment of our season preview we've assessed the chances of each side in the South African conference: https://t.co/BgwOHZfw04 pic.twitter.com/AC1ivYwbWp \u2014 TAB Sport (@TAB_Sport) February 12, 2018\n\nSo while the early start by the five South African conference sides means their SANZAAR partners sit idle, they won\u2019t be relaxing.\n\nAustralian and New Zealand Super Rugby sides sit \u2018in wait\u2019 for African Conference\n\nHaving just completed the Brisbane Global Rugby Tens tournament, the nine Australasian Super Rugby teams each still have targets to reach. Their run under the Queensland sunshine will have done plenty for the sides conditioning. True, a 10 minute half is nowhere near the physical conditioning required for a 40 minute stretch. But, the players had to both \u2018run and think\u2019 in Brisbane. So game fitness will have benefited greatly.\n\nCombinations too played their part"} +{"output_text": " charitable activities.\n\n\"The church is a tax-exempt organisation, but it is not a charitable organisation,\" Patten says. \"It is a business, and it is a business that is not charitable.\"\n\nPatten says the church has a responsibility to its members to pay tax.\n\n\"The church is a business, and it is a business that is not charitable.\"\n\nPatten says the church should be forced to pay tax on its assets.\n\n\"", "input_text": " special entities that can be sued and pledged not to invoke the Ellis defence. But details are sketchy and survivors and their lawyers are wary.\n\nNotably, neither element of the proposed redress scheme would expose the church\u2019s billions of dollars in assets.\n\nEllis proposes the repealing of the historic acts that have allowed the church to immunise its assets from legal action - the key stumbling block in his own case. That way, as in the US and elsewhere, the churches would have to incorporate under general law, making them sueable.\n\nPressure is also growing for church accountability to the wider community.\n\nThe federal government is currently reviewing the Australian Charities and Not-for-Profits Commission, five years after its creation. Tax expert professor Ann O\u2019Connell says the review is an opportunity to scrap the exemption for churches from financial reporting to the regulator.\n\n\"We expect the 50,000 other charities to complete a very simple annual information statement, why do we exclude the religious charities?\" O\u2019Connell says. \"If the local tennis club has to account for it then why shouldn\u2019t a parish church?\"\n\nBoth the Melbourne and the Sydney archdiocese say they support the retention of the charities commission but oppose changes that would require them to lodge annual financial statements.\n\nPeter Johnstone says government departments should now demand that the church properly accounts for its own wealth and how it uses public funds.\n\n\"If I was running the Education Department I would be advising the government to say \u2018If you\u2019re going to get his money, here are the details we need from you to ensure our own public accountability'.\"\n\nThen there is the vexed question of tax.\n\nVictorian upper house MP and Reason Party leader Fiona Patten is finalising a bill seeking to ensure tax exemptions for charities only apply to organisations engaged in"} +{"output_text": "option of the party\u2019s left wing by the Blairite wing.\n\nThe leaked report, which was published by the New Statesman, is a report from the Labour Party\u2019s National Executive Committee (NEC) which was leaked to the New Statesman. The report was written by the Blairite wing of the party, which is led by the former Labour leader, Tony Blair. The report was written in the aftermath of the 2017 general election, which saw the Labour Party lose its majority", "input_text": ", you can rename the.msix to.zip, extract the app and look at it\u2019s contents. You\u2019ll see a registry.dat, VFS and a manifest. Clearly MSIX follows the AppV and AppX Universal package standard.\n\nWhen you launch your MSIX on your test machine, you\u2019ll see you install dialog. It\u2019s not really a wizard since it\u2019s much more simplified than a traditional Windows install wizard. You just install. Since MSIX adheres to a container model. Installs should be streamlined and predictable, as should uninstalls. Uninstall should also always be clean! Hooray.\n\nTo see a Demo of an install and use of my Win32 apps as MSIX, check out the above video demo.\n\nPlease try this out for yourself and provide any and all feedback. As stated this is in a preview state right now. With good feedback, Microsoft can make this tool great. You can provide Feedback by going to the cog icon in the top right corner of the tool and then clicking Feedback. Login with your account and post away. Check out my opinion piece and I may post another complimentary article showing deployment through the Windows Store and Intune (MAYBE \ud83d\ude42 ) Written by: Daniel Xie\n\nTrigger Warning: This article contains disturbing mentions of death threats, ableism, gloating over left leaning Labour activists by labour's establishment.\n\nA leaked report has revealed that the Blairite establishment wing of Labour plotted to undermine Jeremy Corbyn and the party\u2019s electoral chances, during the 2017 general election. The efforts of the Labour establishment to thwart Corbyn and the efforts of various other leftists to realign the party to the left have been well-known. This has been made evident by the coup against Corbyn following the Brexit referendum along with the co-"} +{"output_text": "\n\nStarbucks: Free drink on your birthday\n\nSteak \u2018n Shake: Free dessert\n\nSubway: Free small drink\n\nTaco Bell: Free small drink\n\nTaco John\u2019s: Free small drink\n\nTaco Time: Free small drink\n\nTaco Tico: Free small drink\n\nTaco Bell: Free small drink\n\nTaco John\u2019s: Free small drink\n\nTaco Time: Free small drink\n\nT", "input_text": " choice to enjoy any time during your birthday month\n\nPita Pit (campus area): free 22 oz. Fountain Drink (and free combo upgrade when you join e-club)\n\nPizza Hut: A free dessert with your order.\n\nQdoba Mexican Grill: receive a free entree with the purchase of an entree when you sign up for the e-club.\n\nQuaker Steak & Lube: $5 OFF your check of $25 or more and a free Dessert through the e-club\n\nQuiznos: free cookie plus other coupons through the e-club\n\nRaising Cane\u2019s: Join the Caniac Club and get free food on your birthday and anniversary\n\nRed Lobster: free appetizer; different locations have different promotions through the e-club and My Red Lobster Rewards App\n\nRed Robin: free birthday burger through the Red Robin Royalty program; locations may offer different promotions\n\nRefectory: $20 towards two dinner entrees or $10 towards two bistro dinners\n\nRodizio Grill: Special promotion for your birthday\n\nRuby Tuesday: Free burger or free garden bar entree\n\nRuth\u2019s Chris Steakhouse: Free dessert\n\nSbarro: free slice of pizza via Slice Society program\n\nScrambler Marie\u2019s: free entree on your birthday (or 6 days after) with the MarieClub membership.\n\nSmashBurger: Join the Smashclub email and/or get the app, and get a special birthday offer.\n\nSmokey Bones: Free dessert\n\nSonic Drive-in: free creamslush, tater tots, or medium drink\n\nSpaghetti Warehouse: Free entree and dessert via email club"} +{"output_text": " you if you're looking to borrow a larger amount.\n\nRead our Speedy Cash review\n\n8. CashNetUSA: Low charges on low value loans\n\n(Image credit: CashNetUSA)\n\nCashNetUSA Lower charges but slightly lower loans amounts too BBB rating: NA | Trust Pilot rating: 4.6/5 | States: 14 | Physical stores: 200 | Telephone: Y | Online: Y Visit site Lower repayment charges Online, telephone and physical", "input_text": " Telephone: N | Online: Y Visit site Can improve your credit rating Free financial resources online Program rewards on-time loan repayments Rates more expensive than options outside of payday loan companies\n\nRise Credit states that it looks to work with customers to make repayments affordable and avoid some of the more predatory and exploitative practices of the payday loans industry in general.\n\nHaving been in business since 2014, Rise Credit provides loans ranging in size from $500 to $5,000, and operates in 28 states across the U.S.\n\nRise Credit is a relatively ethical loan company that backs up its products with plenty of resources to help customers manage their finances. Its rates, while still more expensive than other options outside of payday loans, are competitive and its lack of prepayment penalties and pathways to better credit scores are definitely a plus. Key to getting the most from Rise Credit is to follow its rate reduction program and using the educational resources on offer.\n\nRead our Rise Credit review\n\n7. Speedy Cash: Low charges on low value loans\n\n(Image credit: SpeedyCash)\n\nSpeedy Cash Lower charges but slightly lower loans amounts too BBB rating: NA | Trust Pilot rating: 4.6/5 | States: 14 | Physical stores: 200 | Telephone: Y | Online: Y Visit site Lower repayment charges Online, telephone and physical touchpoints Support for customers with low credit No Better Business Bureau rating\n\nSpeedy Cash is actually one of the oldest payday loan providers in this guide, having started in California back in 1997. That carries a weight of trust with it and, if you want to borrow a low amount of money at a low rate of interest, it's well worth a look. This means you can borrow anywhere from $100-500 at a time, which will be of limited use to"} +{"output_text": " the tag name in the cell, and if the tag is in the picked tags array, it is checked with the accessory type.\n\nFinally, modify tableView:didSelectRowAtIndexPath: as follows:\n\n-(void)tableView:(UITableView *)tableView didSelectRowAtIndexPath:(NSIndexPath *)indexPath { Tag *tag = (Tag *)[self.fetchedResultsController objectAtIndexPath:indexPath]; [self.tableView deselectRowAtIndexPath", "input_text": "self.tableView reloadData]; } }\n\nYou ignore a tap on the cancel button whereas you save the new tag if \u201cOK\u201d is tapped. In such a case, instead of implementing the change protocols to the table, you fetch the result again and reload the table view for the sake of simplicity.\n\nNext replace the placeholders for numberOfSectionsInTableView and tableView:numberOfRowsInSection with the following:\n\n-(NSInteger)numberOfSectionsInTableView:(UITableView *)tableView { return 1; } -(NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section { id sectionInfo = [[self.fetchedResultsController sections] objectAtIndex:section]; return [sectionInfo numberOfObjects]; }\n\nThis is pretty straightforward \u2013 there is only one section, and the number of rows is calculated according to the results controller.\n\nNext, modify tableView:cellForRowAtIndexPath: as follows:\n\n-(UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath { static NSString *CellIdentifier = @\"TagCell\"; UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:CellIdentifier]; if (cell == nil) { cell = [[UITableViewCell alloc] initWithStyle:UITableViewCellStyleDefault reuseIdentifier:CellIdentifier]; } cell.accessoryType = UITableViewCellAccessoryNone; Tag *tag = (Tag *)[self.fetchedResultsController objectAtIndexPath:indexPath]; if ([pickedTags containsObject:tag]) { cell.accessoryType = UITableViewCellAccessoryCheckmark; } cell.textLabel.text = tag.name; return cell; }\n\nThis shows"} +{"output_text": " to plunder the nation\u2019s wealth.\n\nThe US is not a friend of India, but it is a friend of the Indian elite. It is a friend of the Indian elite because it is a friend of the Indian elite\u2019s corporate interests. It is a friend of the Indian elite\u2019s corporate interests because it is a friend of the Indian elite\u2019s corporate interests.\n\nThe Indian elite\u2019s corporate interests are not the same as the Indian people\u2019s interests. The", "input_text": "euvering it across the globe. While China lost ground in Libya, it is loathe to do so in the much more strategically important countries of Syria, Iran and Pakistan.\n\nAs far as India is concerned, the US regards it as a key pawn in its geo-political aims by containing China and not as some equal, autonomous partner in a mythological multi-polar world, despite what many in the Indian media may like to think.\n\nWith this in mind, it is always revealing to see how the Indian media reacts when a high-ranking US politician visits its shores. Much of it turns sycophant. It happened when Obama visited in 2010, and it occurred again earlier this year as Hillary Clinton touched down in Kolkata for a three day visit to India. Media people hung on Clinton\u2019s every utterance, looking for the odd phrase that, in their eyes, confirmed India as the great global power.\n\nAccording to many of the news anchors and columnists, Clinton\u2019s decision to honour India with her presence implied that \u2018we\u2019 really matter \u2013 India as the US\u2019s bilateral partner, engaged in forging an important strategic relationship for the century ahead.\n\nIt\u2019s a strange love affair, however; not a match made in heaven, but in a fool\u2019s paradise. The US is pressurising India to reduce its imports of Iranian oil and to open up it economy further to its powerful corporate players, not least foreign direct investment in the retail sector. Economic growth in India is hitting the buffers, sovereignty is being ceded as foreign interests gain control, the poverty alleviation rate is as low as it was 20 years ago and the US-led \u2018globalisation\u2019 project has led to maximal gains for a minority but minimal gains for the great mass of ordinary folk, while causing great turmoil as state-corporate players gain free rein"} +{"output_text": " }\n\n\n\n);\n\n} }\n\nThe complete function is passed a reference to the ListItemComponent component and calls the complete method on each dependency.\n\nclass ListItemSubscriber extends React.Component { static defaultState() {\n\nreturn {\n\nlistItem: null,\n\n};\n\n} componentDidMount() {\n\nthis.props.list", "input_text": " state. To put everything together:\n\nconst listItem = new ListItemModel({\n\nid: '1',\n\ndescription: 'Teach mom how to use Slack'\n\n}); const ListItemSubscriber = ReactAxiom.subscribe(ListItemComponent); ReactDOM.render(\n\n,\n\ndocument.getElementById('app')\n\n);\n\nThe higher order subscribe function wraps the ListItemComponent and returns a new ListItemSubscriber component. The ListItemSubscriber component will then subscribe to the listItem model and update itself if state changes. In the specific above example, clicking on the complete button will cause the button to disappear.\n\nGetting Complex with References\n\nState changes in React Axiom models occur through mutations. As a result, this design allows references to other objects, arrays, and models to operate fairly well in state. The following example adds a dependencies field to ListItemModel and some additional logic to complete a list item.\n\nclass ListItemComponent extends ReactAxiom.Model { static defaultState() {\n\nreturn {\n\nid: null,\n\ndescription: '',\n\ncompleted: false,\n\ndependencies: [],\n\n};\n\n} complete() {\n\nthis.getDependencies().forEach(dependency => {\n\ndependency.complete();\n\n}); this.setCompleted(true);\n\n} }\n\nTo recursively render list items, the ListItemComponent component can render the ListItemSubscriber component for each dependency:\n\nclass ListItemComponent extends React.Component { render() {\n\nconst { listItem } = this.props;\n\nreturn (\n\n
  • \n\n{listItem.getDescription()}\n\n{listItem.isCompleted()? null :"} +{"output_text": "\nKarger\u2019s campaign has also been a model of efficiency. He has a staff of six, including a full-time videographer, and he has been able to raise $1.5 million in just over a year. He has also been able to raise money from a wide range of donors, including the American Civil Liberties Union, the Human Rights Campaign, the National Organization for Women, the National Gay and Lesbian Task Force, the National Center for Lesbian Rights, the National", "input_text": " the names of major Prop 8 donors. \u201cI wanted to make it socially unacceptable to take away the rights of a minority,\u201d he explains\u2014to, as it were, push such behavior into the closet. In July 2008, he held his first rally in front of a San Diego hotel owned by Prop 8 donor Doug Manchester, calling for a boycott. It caught on, and soon major clients were moving their meetings away from Manchester\u2019s properties. By last spring the hotel\u2019s new, gay PR guru let it be known that Manchester would be donating $25,000 in cash and up to $100,000 in hotel credits to any LGBT group that applied. (Few takers so far.)\n\nAs the battle over Prop 8 raged, Karger continued to expose donors and work the press. He tipped off the Wall Street Journal about the Mormons\u2019 involvement, and in September 2008 the paper broke the story. And he kept finding new ways to hound his adversaries: In monitoring post-election campaign finance reports, he noticed that the Mormon church was only reporting $2,078 in nonmonetary contributions to the Prop 8 effort. That didn\u2019t square given that the church had mobilized a huge number of volunteers (many of them former missionaries with ample door-knocking experience), brought in busloads of supporters from Utah, arranged satellite broadcasts of church leaders, and produced a host of slick ads plus a top-notch website.\n\nKarger filed a formal complaint with the California Fair Political Practices Commission, a move that prompted a spokesman to claim that the church had spent \u201czero dollars\u201d on Prop 8. Two months later the church filed a new report saying it had given $190,000 worth of nonmonetary contributions in the few days before the election (after the filing deadline for the earlier report). California election officials are continuing to investigate.\n"} +{"output_text": " ISI\n\nFaludi, Susan. 1991. Backlash: The Undeclared War Against American Women. New York : Crown.\n\nGoogle Scholar\n\nFaludi, Susan. 1999. Stiffed: The Betrayal of the American Man. New York : St. Martin\u2019s Press.\n\nGoogle Scholar\n\nFaludi, Susan. 2003. The Terror Dream: Fear and Fantasy in Post-9/11 America. New York : Metropolitan Books", "input_text": "An Army of Lovers Cannot Fail.\u201d New York : Routledge.\n\nGoogle Scholar\n\nConrad, Peter. 2007. The Medicalization of Society. Baltimore, MD : Johns Hopkins University Press.\n\nGoogle Scholar\n\nCory, Donald Webster. 1951. The Homosexual in America: A Subjective Approach. New York : Greenberg.\n\nGoogle Scholar\n\nDavidson, Lucy, Linnoila, Markku eds. 1989. Secretary\u2019s Task Force Report on Youth Suicide. Vol. 2: Risk Factors for Youth Suicide. Rockville, MD : U.S. Department of Health & Human Services.\n\nGoogle Scholar\n\nD\u2019Emilio, John. 1983. Sexual Politics, Sexual Communities: The Making of a Homosexual Minority in the United States. 1940-1970. Chicago : University of Chicago Press.\n\nGoogle Scholar\n\nDorais, Michel. 2004. Dead Boys Can\u2019t Dance: Sexual Orientation, Masculinity, and Suicide. Montreal : McGill-Queen\u2019s University Press.\n\nGoogle Scholar\n\nDunwoody, Sharon, Peters, Hans Peter. 1992. \u201cMass Media Coverage of Technological and Environmental Risks.\u201d Public Understanding of Science 1: 199 \u2013 230.\n\nGoogle Scholar SAGE Journals\n\nDuRant, Robert H., Krowchuck, Daniel P., Sinal, Sara H. 1998. \u201cVictimization, Use of Violence, and Drug Use at School Among Male Adolescents who Engage in Same-Sex Sexual Behavior.\u201d Journal of Pediatrics 132: 113 \u2013 18.\n\nGoogle Scholar Crossref | ISI\n\nEspeland, Wendy, Stevens, Mitchell. 2008. \u201cA Sociology of Quantification.\u201d European Journal of Sociology 49: 401 \u2013 36.\n\nGoogle Scholar Crossref |"} +{"output_text": ".\n\n\"The fans are not back ordered, they are not working,\" Baker said. \"The Navy has been working with the contractor to address the issue.\"\n\nThe Navy is working with the contractor to address the issue, Baker said.\n\n\"The Navy is working with the contractor to address the issue,\" Baker said. \"The contractor has been working with the Navy to address the issue.\"\n\nThe Navy is working with the contractor to address the issue, Baker said.\n", "input_text": " than their current one and they elected not to move. The other downside is that many of the other barracks rooms are shared, while the ones in the hot barracks are single rooms.\n\nBaker said this is common as many sailors are reluctant to give up the private rooms in the impacted barracks, which are the only ones on base that offer such privacy. She added that sailors suffering from the heat can get fans from the Navy and can buy their own air conditioners.\n\n\n\n\n\n\"Anyone who purchased a window unit may keep it in the barracks, provided the window design can support,\" Baker said. \"However, this is discouraged and personnel should work with the housing office to address their concerns and can be relocated to avoid out of pocket expenses.\"\n\nThe five barracks buildings were built in 1999 and offer each their a dormitory-like set-up, with private bedrooms and a shared common space with a roommate. However, the building's old and over-taxed air conditioning can't keep up when it gets really hot.\n\n\n\n\n\n\"When the buildings were constructed, the HVAC system installed was a ground source heat pump.\" Baker said. \"This design meets 95 percent of the requirements, but during extreme summer heat conditions like we experienced this summer, it is insufficient to maintain the [Navy] mandated 78 degree thermostat setting.\"\n\n\n\n\n\n\n\nThe system was designed to handle up to 91 degrees and keep the barracks at the mandated 78-degrees in sailor's rooms during summer, she said. But at nearly 20-years old they can no longer handle the load \u2014 especially when the heat climbs over 91 degrees.\n\n\n\n\n\n\"The 440-series Green Top barracks were built in 1999, they are in need of renovation,\" Baker said.\n\nOne sailor said the housing office told told them the fans were back ordered and that they didn't work"} +{"output_text": ", and began to struggle to get out of the sticky mess. He was able to get out of the mess, but not before he had to endure the stench of the fart for a good ten minutes.\n\n\"I'm sorry, Angel. I didn't mean to do that. I was just trying to get you to fart in my face, so I could see if you could handle it.\"\n\n\"I can't believe you did that to me, Fluttershy.", "input_text": " the eyes.\n\n\"I'm impressed, Angel. You've done such a good job at enduring mommy's farts. I can understand that you could survive Rainbow Dash's farts - they weren't bad at all, but my farts? I'm surprised you're still breathing!\"\n\nFluttershy closed her eyes, and eeked a burp through her lips, directed at angel. The thick stench of garlic and onion caused him to lazily struggle in her hoof, much to his avail.\n\nShe then put Angel to the other end of her, pressing him in-between her plump buttocks. Fluttershy then glanced over to Rainbow Dash, a wicked and mischievous expression on her face. \"You might want to hold your nose.\" Was all\n\nshe said before opening up in a torrent of hell in gas form. The explosive, blaring flatus raged rigorously against Angel's face. The fart became suddenly sticky and juicy, garbling and bubbling thick,unadulterated bubbles of\n\ntoxic stench against the poor Rabbit. The malodorous fog spread again through the room, leaving not a modicum of fresh air in its wake.\n\nRainbow was sure to block her nostrils and mouth, refusing to let any of the fetid breeze enter her body, and backed away further from Fluttershy, to lessen the chance of having to breathe any in\n\nThe fart spluttered and bubbled on for another full minute, not showing any sign of stopping or getting any less wet.\n\n\"Do you like that, Angel?\" Fluttershy cried, taking Angel's head and smearing it around her unwashed perinial body, covering his white coat in a brown stain of feces.\n\nIt was at this point that Angel gave up"} +{"output_text": "ating message to the West. The Russian government is trying to send a message that it is not going to be pushed around. The Russian government is trying to send a message that it is not going to be pushed around.\n\nThe Russian government is trying to send a message that it is not going to be pushed around.\n\nThe Russian government is trying to send a message that it is not going to be pushed around.\n\nThe Russian government is trying to send a message that it is", "input_text": "uth, \u201cover, under, around or through find a way, or make a way\u201d. Making a way sometimes means making it make sense for your own context.\n\nDon\u2019t misunderstand me\n\nThere is over 30 years of research that says inclusive education is better for everyone. My point is that admitting that inclusion is not available to everyone is not the same thing as saying that is not possible or the right thing to do. I suppose this is why I continue to be the best self-contained classroom teacher I can be. I hope that you can live out inclusive practices in your context. On Dec. 12, a Russian military jet came dangerously close to a Scandinavian Airlines passenger plane in international airspace near southern Sweden. Reportedly, the Russian aircraft was flying without its transponder active when the Swedish military detected it. The Swedes notified civilian air traffic control, which then diverted the civilian jet. A collision was avoided.\n\nImmediately after the December incident, the Russians denied that their aircraft was anywhere near the passenger jet. But the near miss in the skies over Scandinavia was only the latest incident in a consistent pattern of Russian provocations and \u201cwho-me?\u201d denials. In March 2014, a Russian reconnaissance aircraft came close enough to an SAS airliner departing from Copenhagen to require the airliner\u2014carrying more than 100 passengers\u2014to maneuver to avoid a collision.\n\nFor years, Russian aircraft have been doing fly-bys of European neighbors, largely without much public notice. But as Russia's relations with the United States and Europe have deteriorated in recent months following Moscow's annexation of Crimea and support for the rebels in eastern Ukraine, these incidents in the skies seem to have taken on a new urgency\u2014they may even herald a revival of Cold War-era tactics.\n\nMoscow's aggressive behavior is intended as an intimid"} +{"output_text": " in the city of Ayutthaya.\n\nThe king was very fond of Toh and gave him a lot of money to support his studies. He also gave him a lot of land to farm. Toh was very successful in his studies and became a very learned monk. He was very popular among the people and was known as \"the monk who knows everything.\"\n\nOne day, the king asked him, \"What is the meaning of the word 'samsara'?\"\n\n", "input_text": " over the surface of the water. Then she handed the bowl to him as he was sitting on his horse. He took one long look at the stamens on top of the water and then had to drink the water very carefully so as not to swallow them. As he handed the bowl back to her, he asked her, \"Was that a trick?\"\n\n\"No,\" she said. \"I saw that you were so thirsty that you might gulp the water down and end up choking on it. So I figured this would be a good way to make sure that you drank slowly.\"\n\nWell. He asked her, \"Are your parents around?\" So she fetched her parents. They didn't know who he was, but he was obviously a nobleman, so when he told them, \"I'd like to have your daughter,\" they gave their consent. So she joined the king in the army camp, but as the campaign was ending he said to her, \"I'm afraid I can't take you down to the palace with me, but in case you do have a child by me, here's my belt. Give the child my belt and I'll know that it's my child. I'll take care of him or her in the future.\" So he left her and went down to Bangkok.\n\nHer whole family soon followed down to Bangkok when they discovered that she actually was pregnant. They moved onto a floating house moored on the bank of the Chao Phraya River in front of a monastery, Wat In. She gave birth to a son and named him Toh, which means \"large.\" When he was old enough, he was ordained as a novice. A few years later, when the nobleman had become King Rama II, the family took Novice Toh to Wat Nibbanaram \u2014 currently Wat Mahathaad, a temple"} +{"output_text": " in 1952 has been safe.\n\nBut that could change.\n\nThe NFL is a passing league, and the league\u2019s best quarterbacks are getting better. The league\u2019s best quarterbacks are getting better, and the league\u2019s best quarterbacks are getting better.\n\nThe league\u2019s best quarterbacks are getting better, and the league\u2019s best quarterbacks are getting better.\n\nThe league\u2019s best quarterbacks are getting better, and the league\u2019s", "input_text": " on his career.\n\nPaul Krause, all-time ball hawking champ\n\nThe drop-off in interceptions in the NFL also means some defensive backs have been immortalized.\n\nKrause finished his Hall of Fame career with 81 career interceptions. That may not sound like a lot, but the only active player in the NFL with more than 36 is 39-year-old Vikings cornerback Terence Newman, who has 42.\n\nIn 16 seasons with Washington and Minnesota, Krause finished eight with at least six interceptions and two with at least 10. For perspective, there isn\u2019t a single active player in the NFL who has topped six interceptions more than twice in his career.\n\nKrause did most of his damage as a free safety before retiring in 1979. His 81 interceptions will almost definitely sit in the record books forever.\n\nNight Train Lane, the rookie sensation of 1952\n\nOf the unbreakable interception records here, Richard \u201cNight Train\u201d Lane\u2019s 14 interceptions in 1952 has stood the longest, but is also the most breakable. Still, the odds are that it\u2019s going to be safe for a very long time.\n\nThe few defensive backs capable of coming close are avoided by quarterbacks. The last time a defensive player even had more than 10 interceptions was in 1981 when Everson Walls had 11.\n\nIf \u2014 and it\u2019s a big if \u2014 a player went on a run and got to 10 interceptions in a season, opposing teams would do whatever possible to avoid him. When Richard Sherman led the NFL with eight interceptions in 2013, he was targeted once every 9.6 times a team dropped back to pass \u2014 or less than four times per game.\n\nAnd so for the last few decades, Lane\u2019s 14 interception record that he set"} +{"output_text": " he was a Tory.\n\nThe Tories were the party of the empire. They were the party of the aristocracy. They were the party of the landed gentry. They were the party of the Church of England. They were the party of the monarchy. They were the party of the aristocracy. They were the party of the landed gentry. They were the party of the Church of England. They were the party of the monarchy. They were the party of the arist", "input_text": " didn\u2019t like it. I think it makes the best sense for story.\u201d\n\nStar Trek: Discovery returns to CBS All Access today, January 7, after a hiatus, but the producers do not yet know if season two will be split in half as well.\n\n\u201cIt\u2019s too far in the future for us to even know,\u201d Berg said. \u201cWe\u2019re just starting again in season two.\u201d MYTH: it is the great enabler, the unifier and the falsifier. Truth hammers in vain against the might of myth and all its panoply, like a mouse on a castle wall.\n\nBritain this year celebrates the triumphs of 1940 and the Battle of Britain, even though 70 years ago last Sunday, Hitler all but decided against going ahead with an invasion of Britain. On that day, July 31, 1940, the German naval chief Raeder told the Fuehrer that such plans as existed to invade Britain were playing havoc with German commerce, as barges and fishing vessels were being conscripted from all over the Reich. But to no purpose, because tides allowed only two possible windows. One was August 20-26 and the other was September 19-26. The first was too soon and the second was traditionally one of bad weather. Anyway, the German navy was ready for neither.\n\nHitler agreed. He would continue the pretence of an imminent invasion, with an aerial assault designed to bring Britain to its senses and persuade it to negotiate.\n\n\"We have no business destroying Britain,\" he declared. \"We are quite incapable of taking up its legacy.\"\n\n\"Its legacy\", of course, was the empire. And who was the greatest imperialist of all? Who agreed with Hitler's respectful views of it? Churchill. And why was Churchill in a political wilderness through the 1930s? Because"} +{"output_text": "git clone https://github.com/mattdesl/haskell-game-of-life.git git checkout -b add-random-event\n\nThe code is also available on github.\n\nThe next step is to add a new network. We will use the same approach as before, but this time we will use a different network.\n\nThe new network is a bit more complicated, because it is not just a simple list of cells, but a list of lists", "input_text": "newExternalEvent _ <- RH.performPostBuild $ do -- added RH.performPostBuild $ liftIO $ forkIO $ forever $ do -- added voidliftIOforkIOforever _ <- decayT () -- added decayT () =<< randomRIO ( 3000000, 10000000 ) -- added threadDelayrandomRIO ( let foodChange = R.leftmost foodChangeR.leftmost $> \\fs -> genNewFoodM fs [ startEvent\\fsgenNewFoodM fs $> \\fs -> genNewFoodM fs, restartEvent\\fsgenNewFoodM fs <&> \\snake fs -> pure (genNewFood fs snake), R.updated snakeDyn\\snake fs(genNewFood fs snake) $> \\fs -> genNewFoodM $ tail fs -- added, foodDecayEvent\\fsgenNewFoodMfs ] <- liftIO infiniteFoodSupplyliftIO [ zipWith V2 x y x y | x <- newStdGen <&> randomRs ( 1, width) newStdGenrandomRs (, width) <- newStdGen <&> randomRs ( 1, height), ynewStdGenrandomRs (, height) ] allTheFood :: R.Dynamic t [ Coord ] <- R.foldDynM id t [R.foldDynM infiniteFoodSupply foodChange pure $ head <$> allTheFood allTheFood\n\nBecause the different networks are decoupled, it is no problem adding a new event locally - only 6 added lines of code. We have a new event that fires randomly every 3-10sec. The way it is set up the timer is not reset when food is eaten or when the game resets, but implementing that would not make things less local.\n\nThe corresponding code is available in a branch, i.e.\n\n"} +{"output_text": " would be willing to sign up to that?\n\nThe EBU has been in the news a lot recently, and not for the best of reasons. The organisation has been accused of being a bit of a bully, and of being a bit of a bully in the past.\n\nThe EBU has been accused of being a bully in the past.\n\nThe EBU has been accused of being a bully in the past.\n\nThe EBU has been accused of being a bully", "input_text": " were being accepted from Monday 19th October and the deadline is Friday 13th November. From announcement to curtain close \u2013 four weeks.\n\nFour weeks would be a pretty swift turnaround for a contract of that magnitude at the best of times, but \u2013 either by unfortunate coincidence or sneaky design \u2013 this has all come at a rather hectic point on the Eurovision calendar. Because anyone who is well acquainted with the internal workings of the Eurovision Family of Events (i.e. the sort of person that would be perfect to put in a bid) will be hard at work putting on the Junior Eurovision Song Contest.\n\nWhen is Junior ESC? Saturday 21st November. The following weekend.\n\nBut, wait. There\u2019s more. Tucked away in the terms and conditions of the tender is the stipulation that any company which does decide to bid can\u2019t publicly reveal that they\u2019ve submitted a bid \u2013 even if they are unsuccessful \u2013 unless they get the express written permission of the EBU to do so.\n\nSo much for being open and transparent.\n\nAnd worse than that, anything that a company does submit as part of the tender process becomes the exclusive property of the EBU. In the terms and conditions of making a bid, bidders agree that \u201cthe EBU shall be entitled to use (free from any payment or restriction) all ideas, concepts, proposals, recommendations or other materials (save for the bidder\u2019s trademarks and third parties\u2019 copyrighted materials as communicated in writing by the bidder to the EBU)\u201d\n\nAll of which appears to mean that the EBU can take your ideas, not pay you for them, and you aren\u2019t able to speak up about it because you aren\u2019t allowed to reveal that you submitted a bid.\n\nIt might just be clumsily worded, but what sort of man"} +{"output_text": " say, it will be a major blow to the ability of minority communities to participate in the political process.\n\n\u201cThe census is the most important tool we have to ensure that we have a fair and accurate count of the people in our country,\u201d said Kristen Clarke, president and executive director of the Lawyers\u2019 Committee for Civil Rights Under Law, which has filed a brief in the census case. \u201cIf the census is not accurate, it will have a devastating impact on the ability of communities", "input_text": " report said. Influenced by this evidence, a jury sentenced Buck to death.\n\nBuck appealed to the Supreme Court, alleging that his Sixth Amendment right to effective counsel had been violated. In 2017, the Supreme Court reversed the death penalty conviction in a 6-2 decision, with Chief Justice John Roberts writing for the majority.\n\n\u201cThere is a reasonable probability that Buck was sentenced to death in part because of his race,\u201d Roberts wrote. \u201cThis is a disturbing departure from the basic premise that our criminal law punishes people for what they do, not who they are. That it concerned race amplifies the problem. Relying on race to impose a criminal sanction \u2018poisons public confidence\u2019 in the judicial process.\u201d\n\nWith the Supreme Court set to rule in the coming days on the legality of the Trump administration\u2019s addition of a citizenship question to the 2020 census, all eyes are on Roberts, now the court\u2019s ostensible swing justice after the retirement of Anthony Kennedy. Many are wondering which Roberts we\u2019ll see\u2014the pragmatic justice who saved the Affordable Care Act in 2012 and has occasionally, as in the Buck case, sought to redress egregious instances of racial discrimination? Or the hard-edged conservative who wrote the majority opinion gutting the Voting Rights Act in 2013 and has spent much of his career trying to roll back civil rights for racial and ethnic minorities? His ruling in the census case will determine Roberts\u2019 legacy\u2014and that of the court he presides over\u2014for many years to come.\n\nAt stake is whether Roberts will allow the Trump administration to rig American politics for the next decade by corrupting the census, and turn the Supreme Court\u2014which he has often said should stay above partisan politics\u2014into an unmistakable ally of the Republican Party and white power. If the court upholds the administration\u2019s addition of the question, experts"} +{"output_text": "OUS! I am a 3 year old, female, domestic shorthair. I am a very sweet girl who loves to be held and cuddled. I am a very loving girl who loves to be petted and scratched. I am a very playful girl who loves to play with toys. I am a very loving girl who loves to be petted and scratched. I am a very playful girl who loves to play with toys. I am a very loving girl who loves to be petted and scratched", "input_text": " 1 Special Teams: 0 Offense: 2 Defense: 2 Special Teams: 0 Oregon State Offense: 0 Defense: 0 Special Teams: 0 Offense: 1 Defense: 0 Special Teams: 0 Offense: 0 Defense: 2 Special Teams: 1 Offense: 1 Defense: 0 Special Teams: 0 Stanford Offense: 0 Defense: 2 Special Teams: 0 Offense: 2 Defense: 2 Special Teams: 1 Offense: 2 Defense: 1 Special Teams: 1 Offense: 1 Defense: 0 Special Teams: 0 UCLA Offense: 1 Defense: 0 Special Teams: 0 Offense: 0 Defense: 2 Special Teams: 0 Offense: 0 Defense: 1 Special Teams: 2 Offense: 2 Defense: 0 Special Teams: 1 USC Offense: 1 Defense: 2 Special Teams: 0 Offense: 3 Defense: 1 Special Teams: 0 Offense: 1 Defense: 1 Special Teams: 0 Offense: 1 Defense: 1 Special Teams: 0 Utah Offense: 0 Defense: 2 Special Teams: 1 Offense: 0 Defense: 1 Special Teams: 0 Offense: 1 Defense: 1 Special Teams: 0 Offense: 0 Defense: 1 Special Teams: 0 Washington Offense: 4 Defense: 3 Special Teams: 1 Offense: 1 Defense: 2 Special Teams: 0 Offense: 1 Defense: 0 Special Teams: 0 Offense: 0 Defense: 1 Special Teams: 0 Washington State Offense: 1 Defense: 1 Special Teams: 0 Offense: 1 Defense: 0 Special Teams: 0 Offense: 2 Defense: 2 Special Teams: 0 Offense: 2 Defense: 1 Special Teams: 1\n\n* Carrington was dismissed from the team in July. Black Beauty has found her forever home. Congratulations Beauty!\n\nHi, My name is Black Beauty!\n\nAs you can tell from my name, I am absolutely GORGE"} +{"output_text": " UFC.\n\n\u201cI\u2019ve been in the UFC for a while now and I\u2019ve had a lot of success,\u201d Ray told. \u201cI\u2019ve had a lot of wins and I\u2019ve had a lot of losses. I\u2019ve had a lot of fights that I\u2019ve won and a lot of fights that I\u2019ve lost. I\u2019ve had a lot of fights that I\u2019ve won and a lot of fights that I\u2019ve lost.\n\n\u201cI", "input_text": "Higher Level\u2019 appears to be more than just a moniker.\n\n\u201cWe had a really good year,\u201d James told. \u201cTo be honest we\u2019ve outgrown our current facilities and are looking at moving into a new, bigger gym this year. The fight team have had an extremely busy year as well as having some of our guys sign with the UFC. The numbers we\u2019ve accumulated has been fantastic and that\u2019s going to make it easy to find a bigger, better premises to train out of.\n\n\u201cWe had two or three of our amateurs turn pro in 2017 and in 2018, we\u2019ll have a steady stream of guys ready to turn professional as well as many ready to make their amateur debuts. We\u2019ve a strong number of youthful students \u2013 a lot of them being teenagers around seventeen to eighteen years old as well as some learning the fundamentals around fourteen and fifteen.\n\n\u201cWith the ratio like that, it\u2019s only going to help the fight teams get better, bigger and stronger. When you look back to when I was fighting, myself and Paul McVeigh retired at the same time and the fight team never really recovered in the sense that, when two guys are competing at a high level and it suddenly stops, nothing happens for a while.\n\n\u201cThe way our team focuses is different,\u201d Doolan divulged. \u201cI made a point to specialise in MMA. I tailored the gym for MMA \u2013 not other disciplines, so if someone came in looking to do jiu jitsu competitively, I\u2019d recommend them to go to The Griphouse for instance. Our gym is fully MMA and I think it\u2019s healthier that way.\u201d\n\nWith the aforementioned UFC signings, Higher Level\u2019s most pristine combatant Stevie Ray (21-7) has made quite an impression during his run in the"} +{"output_text": "iques. En effet, les d\u00e9veloppeurs de l\u2019\u00e9diteur Funcom ont choisi de faire appel \u00e0 des auteurs comme Howard, qui ont \u00e9t\u00e9 largement diffus\u00e9s dans les ann\u00e9es 1930 et 1940, pour cr\u00e9er un univers qui se veut proche de celui de la mythologie nordique.\n\nCette approche est d\u2019autant plus int\u00e9ressante que la franchise Conan a \u00e9t\u00e9 cr\u00e9\u00e9e en 2008,", "input_text": "\u00e9aire. Au contraire, l\u2019histoire de leur diffusion jusqu\u2019\u00e0 nos jours est faite de multiples ramifications, parfois m\u00eame de retour en arri\u00e8re. Dans mon travail de recherche, j\u2019ai appel\u00e9 ce ph\u00e9nom\u00e8ne un processus de continuit\u00e9 non lin\u00e9aire, pour montrer que s\u2019il est important de consid\u00e9rer la dimension historique des ph\u00e9nom\u00e8nes culturels, il ne faut cependant pas se limiter \u00e0 suivre b\u00eatement la \u00ab fl\u00e8che du temps \u00bb.\n\nL\u2019histoire d\u2019une franchise et la construction du canon\n\nL\u2019\u00e9tude de l\u2019histoire de la r\u00e9ception des sources nous m\u00e8ne jusqu\u2019au XXe si\u00e8cle. Le jeu Age of Conan : Hyborian Adventures (Funcom, 2008) se fonde sur les \u00e9crits de l\u2019auteur texan Robert Ervin Howard publi\u00e9s dans les ann\u00e9es 1930, qui mettent en sc\u00e8ne les aventures de son h\u00e9ros Conan, le fameux barbare que beaucoup connaissent \u00e0 travers l\u2019incarnation d\u2019Arnold Schwarzenegger au cin\u00e9ma. Ce jeu en ligne est donc un produit qui s\u2019inscrit au sein d\u2019une franchise m\u00e9diatique dont les d\u00e9veloppements n\u00e9cessitent de tenir compte des strat\u00e9gies commerciales des entreprises qui les d\u00e9veloppent.\n\nDans ce cadre, la volont\u00e9 de certains acteurs des \u00ab mondes de l\u2019art \u00bb dans lesquels s\u2019inscrivent les productions de la franchise Conan influence la r\u00e9ception des r\u00e9f\u00e9rences aux mythes nord"} +{"output_text": ".\n\nThe Seahawks are the best team in the NFL. They\u2019re the best team in the history of the NFL. They\u2019re the best team in the history of the NFL. They\u2019re the best team in the history of the NFL. They\u2019re the best team in the history of the NFL. They\u2019re the best team in the history of the NFL. They\u2019re the best team in the history of the NFL. They\u2019re the best team in the", "input_text": " SEO expert roundup should hopefully inspire people who are reluctant to realise their real online profits potential. A solid SEO strategy is worth more than a 6-figure marketing budget in a corporate boardroom.\n\nThe majority of big businesses are less agile than DIY SEOs and smaller companies. I\u2019m sure this writeup won\u2019t do much for the people who have already settled on the thought that SEO is dead, however, those who are hungry for success will probably leap into action very soon\u2026 and I\u2019ll have more case studies to write about!\n\nLegend/Technical Jargon: Now about the clever stuff in the brackets. It\u2019s domain authority measurement to try to illustrate the fact that backlinks are a major part of the ranking algorithm (as are various content-related metrics). Domain autohority (DA) is a metric invented by Moz.com. The higher the better. If DA is higher than 20, it\u2019s likely the website\u2019s owner has done some form of link building. TF and CF were introduced by Majestic.com. Trust Flow (TF) is a qualitative measurement and represents authority of the backlinks pointing to a site. Citation Flow (CF) is a quantitative metric. Happy New Year.\n\nI mentioned this in the podcast yesterday but wanted to put it down in words. For me, the thing that was most bothersome about Super Bowl XLIX wasn\u2019t so much the agonising manner of the defeat. It was the way it jeopardised the legacy of this era of Seahawks football.\n\nWin back-to-back Super Bowls and this is the team of the decade. Forever to be remembered as one of the greats. Pete Carroll would be one Super Bowl title behind Bill Belichick\u2019s incredible haul. It wouldn\u2019t matter if they didn\u2019t win it this season"} +{"output_text": "gas que estimulam a secre\u00e7\u00e3o de glicose\n\ndrogas que estimulam a secre\u00e7\u00e3o de insulina\n\ndrogas que estimulam a secre\u00e7\u00e3o de insulina\n\ndrogas que estimulam a secre\u00e7\u00e3o de glicose\n\ndrogas que estimulam a secre\u00e7\u00e3o de glicose\n\ndrogas que estimulam a secre\u00e7\u00e3o de insulina\n\ndrogas que estimulam a secre\u00e7\u00e3o de ins", "input_text": "inais leves\n\ndores de cabe\u00e7a\n\nleve dor abdominal\n\nEm um grau muito menor, alguns tamb\u00e9m notaram:\n\ndor retal\n\nproblemas dent\u00e1rios\n\nmenstrua\u00e7\u00e3o irregular para mulheres\n\num efeito de fadiga recorrente\n\nMaiores informa\u00e7\u00f5es sobre Orlistat (Brasil e Porgugal)\n\nEstes efeitos colaterais n\u00e3o s\u00e3o exaustivos, mas representam a maioria dos efeitos colaterais experimentados por pacientes cujo corpo basicamente n\u00e3o aceitou a intrus\u00e3o do comprimido. Vale a pena notar, novamente, que nem todos os pacientes est\u00e3o sujeitos a isso, e para descobrir se \u00e9 esse o seu caso, n\u00e3o ter\u00e1 outra escolha sen\u00e3o experimentar.\n\nAl\u00e9m disso, se estes efeitos secund\u00e1rios persistirem, ser\u00e1 necess\u00e1rio falar diretamente com o seu m\u00e9dico, que poder\u00e1 aconselhar-lhe um medicamento para perda de peso, mais adequado para o seu organismo.\n\nOrlistat : Intera\u00e7\u00f5es medicamentosas e contra-indica\u00e7\u00f5es\n\nAlguns tratamentos podem entrar em conflito com outros medicamentos e, quimicamente, criar efeitos inesperados; portanto, pondere consultar o seu m\u00e9dico antes de tom\u00e1-lo e inform\u00e1-lo sobre quaisquer doen\u00e7as que possa ter assim como os tratamentos que faz regularmente.\n\nO tratamento para emagrecimento de Orlistat provavelmente colidir\u00e1 com:\n\ndro"} +{"output_text": "ST SIDE STORY\n\nCharles Nemo was born in Chicago, Illinois, on January 1, 1928. He was the son of a Chicago police officer and a homemaker. He was a good student and graduated from high school in 1946. He attended the University of Illinois at Urbana-Champaign and graduated with a degree in journalism in 1950. He worked as a reporter for the Champaign News-Gazette and the Champaign News-Gazette-Telegram", "input_text": " recalled previous witnesses and experts who had testified. The prosecution reminded the jury of the heinous crimes committed by Gacy, talked of his manipulative behavior, his rape and torture of the victims and how his crimes were premeditated and planned. The defense insisted that Gacy was insane and out of control at the time of the killings and pointed to the testimony given by experts during the trial. After the closing arguments and the testimony of over a hundred witnesses over a period of five weeks, the jury was left to make their decision.\n\nIt took only two hours of deliberation before the jury came back with its verdict. The courtroom was filled with silence and everyone within stood at attention when the jury marched in with its verdict. The silence was broken when the court clerk read, \"We, the jury, find the defendant, John Wayne Gacy, guilty...\" Gacy was found guilty in the deaths of thirty-three young men and as Sullivan said, he had the \"singular notoriety of having been convicted of more murders than anyone else in American history.\" Gacy received the death penalty and was sent to Menard Correctional Center where, after years of appeals, he eventually was killed by lethal injection.\n\nBibliography\n\nThis feature story is primarily drawn from the Chicago Tribune and the Chicago Sun-Times, plus the following books:\n\nCahill, Tim, Buried Dreams: Inside the Mind of a Serial Killer (1986).\n\nLinedecker, Clifford L., Man Who Killed Boys (St. Martin's Paperbacks, 1994).\n\nMendenhall, Harlan, Fall of the House of Gacy (Mass Market Paperbacks, 1998).\n\nSullivan, Terry and Peter T. Maiken, Killer Clown (Mass Market Paperback, 1997)\n\nCharles Nemo\n\nJOHNNY, WE"} +{"output_text": " sur le programme de d\u00e9fense F-15SE, qui sera d\u00e9ploy\u00e9 en Europe, et sur le programme de d\u00e9fense F-18SE, qui sera d\u00e9ploy\u00e9 en Asie.\n\n\u2022 Raytheon compte sur le programme de d\u00e9fense Patriot, qui sera d\u00e9ploy\u00e9 en Europe, et sur le programme de d\u00e9fense PAC-3, qui sera d\u00e9ploy\u00e9 en Asie.\n\n\u2022 Northrop Grumman compte sur le programme de d\u00e9f", "input_text": "it, usage de bombes \u00e0 sous-munitions, rapports alarmistes d\u2019Amnesty international, 3 millions de d\u00e9plac\u00e9s, etc), l\u2019ex\u00e9cutif am\u00e9ricain fournit une assistance militaire \u00e0 l\u2019arm\u00e9e saoudienne, et lui a notamment transf\u00e9r\u00e9 plus de 3 milliards de dollars d\u2019armement et de munitions.\n\nDes contrats-mirage?\n\nLire aussi Andrew Cockburn, \u00ab Au Pentagone, la peur pour carburant \u00bb, Le Monde diplomatique, f\u00e9vrier 2017. Pour Bruce Riedel, ancien de la CIA, aujourd\u2019hui consultant \u00e0 la Brookings Institution, l\u2019accord global d\u00e9croch\u00e9 par le pr\u00e9sident Donald Trump est plut\u00f4t \u00e0 ranger au magasin des \u00ab fake news \u00bb (L\u2019Express, 6 juin 2017) : une partie des contrats avaient \u00e9t\u00e9 conclus sous Obama ; d\u2019autres sont le plus souvent des lettres d\u2019int\u00e9r\u00eat ou d\u2019intention, dont rien ne garantit qu\u2019elles d\u00e9boucheront sur des contrats en bonne et due forme.\n\nTous les grands fabricants d\u2019armes am\u00e9ricains sont cependant sur les rangs :\n\n\u2022 Lockheed Martin annonce 28 milliards de commandes, avec son syst\u00e8me de d\u00e9fense antimissile sol-air, des fr\u00e9gates LCS, des missiles. 150 h\u00e9licopt\u00e8res Black Hawk S-70 seront construits sur place, apr\u00e8s la cr\u00e9ation d\u2019une filiale en collaboration avec Taqnia, un op\u00e9rateur saoudien. Des discussions sont en cours sur l\u2019acquisition du syst\u00e8me de d\u00e9fense Thaad.\n\n\u2022 Boeing compte"} +{"output_text": " is the Chief Technology Officer of a company called Netcrave.\n\nNetcrave is a cloud-based software company that provides \u201ca platform for building and deploying applications.\u201d The company\u2019s website describes itself as \u201ca cloud-based platform for building and deploying applications.\u201d\n\nThompson\u2019s LinkedIn profile states that she is the Chief Technology Officer of Netcrave. Thompson\u2019s LinkedIn profile also states that she is a \u201cSoftware Engineer\u201d at Netcrave.\n\nThom", "input_text": " Seattle that lead to the arrest of Paige Thompson, 33yo software engineer accused of hacking databases and stealing info on 100 million credit card applications for #CapitalOne in a major breach. Housemates share details @ Noon @KIRO7Seattle pic.twitter.com/NXsjfAOInn \u2014 Ranji Sinha (@RanjiKIRO7) July 30, 2019\n\nFBI investigators obtained a warrant to search Paige Adele Thompson\u2019s home in Seattle and executed the search on July 29, 2019. According to the criminal complaint, Thompson was present along with five other people. CBS affiliate KIRO-TV in Seattle obtained video, embedded in the tweet above, of the moment agents arrived to search Thompson\u2019s house.\n\nInvestigators seized several digital devices from Thompson\u2019s bedroom. The complaint explained that \u201cduring the initial search of some of these devices, agents observed files and items that referenced Capital One and the Cloud Computing Company, other entities that may have been the targets of attempted or actual network intrusions, and \u201cerratic,\u201d the alias associated with Paige A. Thompson.\u201d\n\nThompson was arrested and charged with computer fraud and abuse. According to Bloomberg, Thompson \u201cbroke down and laid her head down on the defense table\u201d during an arraignment on July 29.\n\nA bond hearing was held on August 1. Federal inmate records show that Thompson remains behind bars at a detention facility in Seattle. Thompson faces up to five years in prison and a $250,000 if convicted.\n\n5. Paige Thompson Calls Herself the Chief Technology Officer of a Company Called Netcrave\n\nPaige Adele Thompson described herself as a \u201cProgrammer, sysadmin, electronics enthusiast\u201d on the GitLab profile mentioned in the third section of this article. Another profile states that she"} +{"output_text": " was a Terminator, and that she was not a Skynet Terminator. She may have been programmed to kill John Connor, and she may have been programmed to kill Kyle Reese. She may have been programmed to kill John Henry, and she may have been programmed to kill Kyle Reese. She may have been programmed to kill John Henry, and she may have been programmed to kill Kyle Reese. She may have been programmed to kill John Henry, and she may have been programmed to", "input_text": " actually working for John Henry or Weaver's faction, as she claimed to be? Was she never actually reprogrammed? A1: Unlikely. Cameron simply seems to be a machine designated to get close to John in order to kill him. Also, there is evidence in \"Vick's Chip\" that Cameron had been reprogrammed by John Connor at some point. A2: Possibly. Catherine Weaver seemed to show a disregard for human life, so this third faction would have no problem killing Allison Young to get close to John, and they would know that John needed Cameron to send her back to the past. It would also explain why Derek was left alive, and also how by the time Derek got back a day or two later, Cameron was already working for the good side.\n\nWas Cameron, who killed Allison Young, actually working for John Henry or Weaver's faction, as she claimed to be? Was she never actually reprogrammed?\n\nQ: Is it possible to believe that these were future John Connor's orders to Cameron? A: It is possible. John Henry may be mix of the powerful AI created by Weaver which matches Skynet and the knowledge in human behavior and life that Cameron learned in the past. Perhaps John Connor jumping over to the future changed all the plans \u2014 thus, either the story of the breakout from Century Camp happened instead with John Henry/Cameron and Kyle and John Henry/Cameron is the leader of the Resistance.\n\nIs it possible to believe that these were future John Connor's orders to Cameron?\n\nQ: Why did Cameron react the way she did when she heard Weaver's offer on this episode? Are there untold issues of mistrust or hostility between Weaver and Cameron? A: Cameron realized that a Series 1000 Terminator was active in the past. It's also possible that she knew at this point that she"} +{"output_text": "\u2019t sleep with my feet uncovered is because I\u2019m afraid of La Mano Peluda. This is a creature that is said to be a hybrid of a human and a monkey. It\u2019s a creature that is said to be able to climb trees and leap from one to another. It\u2019s also said to be able to climb walls and jump from one to another. It\u2019s also said to be able to climb up to the roof of a house and jump from one to another", "input_text": " more familiar territory. I don\u2019t think any Latinx legend is as well known in the United States as La Chupacabra, the goat sucker. Known as the scourge of farmers and ranchers, the chupacabra is a creature that sneaks onto the land at night and kills the livestock of the owners. It leaves them mutilated. If you\u2019re a Mexican child it waits in the trees to eat you if you don\u2019t come inside. But really, this one was less of a story my wela told so much as it is a story that the US was fascinated with in the 1990s and subsequentially we started asking questions about.\n\nIn truth, La Chupacabra was first reported in Puerto Rico and is now known across the South West. So much so that I remember seeing reports of Chupacabra sightings on the nightly news in San Antonio. This is a cryptid that started in Puerto Rico but has since moved to northern Mexico, Chile, the Dominican Republic, Bolivia, states on the border, and other places across Latin America.\n\nLos Duendes\n\nThese creatures are from Latin American, Filipino, and Spanish culture. They\u2019re essentially little gnomes who, in Mexican traditions according to my wela, live in the walls of kids\u2019 rooms. They wait there until you\u2019re asleep and if you\u2019re a dirty child who refuses to bathe they cut your toenails and almost always miss and cut off your toes. This is also one of two reasons why I refuse to sleep with my feet uncovered. There are still some states in Mexico, according to Border Town writer Esquivel that will allow you to break your lease without penalty if you claim duendes in the walls.\n\nLa Mano Peluda\n\nReason number two that I don"} +{"output_text": "I.K.E.3 is the first keyboard to feature a membrane key-bed, which offers the full tactile feedback of mechanical keys but without the excessive noise or the need to 'bottom out' the keys. The S.T.R.I.K.E.3 is also the first keyboard to feature a removable wrist-rest, which can be used to adjust the height of the keyboard for a more comfortable typing experience.\n\nThe S.T.R.I", "input_text": " available in gloss black, gloss white and gloss red colors.\n\nMAD CATZ Announces S.T.R.I.K.E. 3 Professional Gaming Keyboard for Windows PC\n\nTwelve Macro Keys and Thirty-Six Programmable Comands Enhance Competitive Gaming\n\nSan Diego \u2013 June 10, 2013 \u2013 Mad Catz Interactive, Inc. (\"Mad Catz\") (NYSE MKT: MCZ) announced today the S.T.R.I.K.E.3 Professional Gaming Keyboard for Windows\u00ae PC. Expected to ship fall 2013, the S.T.R.I.K.E.3 has been designed with the competitive gamer in mind, offering an impressive feature set and a unique membrane key-bed designed to offer the full tactile feedback of mechanical keys but without the excessive noise or the need to 'bottom out' the keys.\n\n\"The S.T.R.I.K.E.3 demonstrates our commitment to providing gamers with a product range that meets their budgets and exceeds their expectations,\" said Darren Richardson, President and Chief Executive Officer of Mad Catz Interactive. \"Our range of S.T.R.I.K.E. keyboards has captured the imagination of passionate gamers and we are pleased to expand the range with the S.T.R.I.K.E.3.\"\n\nThe S.T.R.I.K.E.3 features a full RGB backlit key-bed, capable of displaying up to sixteen million customizable colors. In addition to full media controls and a removable wrist-rest, the S.T.R.I.K.E.3 features a total of twelve macro keys and three separate modes of operation, providing a total of thirty-six programmable buttons.\n\nThe S.T.R."} +{"output_text": "rd W4- 2014, 1st W4- 2015, 1st W4- 2016\n\nLuczak: 1st W4- 2013, 1st W4- 2014, 1st W4- 2015, 1st W4- 2016\n\n2016 record: 1st W4- Lucerne World Cup, 1st W4- Varese World Cup\n\nThe USA\u2019s women\u2019s rowing team has been a bit of a mixed bag over the last", "input_text": "\n\nWorld Championship record\n\nPersse: 6th W2- 2011, 6th W2- 2013, 6th W2- 2014, 5th W2- 2015\n\nChristowitz: 4th U23 W2- 2010, 5th U23 W2- 2011\n\n2016 record: 1st W2- Varese World Cup, 4th W2- Lucerne World Cup\n\nGiven the size of the South African women\u2019s rowing squad, it\u2019s perhaps no surprise that Persee has spent her entire international career racing in the W2-, making her debut at the Munich World Cup in 2010 partnering Hayley-Jo Arthur. In 2011 Arthur was swapped with Naydene Smith and they finished 6th at the Bled World Championships. This pair raced together for the next 4 years finishing 6th at each of the next 3 World Championships and 8th at the London Olympics. In 2015 they finally broke their \u201c6th place hoodoo\u201d and went one better, finishing 5th at the Aiguebelette World Championships. In 2016 a new partnership was formed with Christowitz with immediate results, winning gold at the Varese World Cup, South Africa\u2019s 1st ever gold medal in this event and their first medal since the 2nd World Cup in 2011. For a new partnership that\u2019s not a bad way to start and for Christowitz, making her senior International debut it was pretty impressive. At Lucerne they just missed out on the medals but their form this season has shown that they cannot be discounted for a medal although I think they may find the going tough against the British, Kiwis and Americans.\n\nThe USA\n\nFelice Mueller 26 & Grace Luczak 27\n\nOlympic record: 2016 Olympic debut\n\nWorld Championship record\n\nMueller: 1st W4- 2013, 3"} +{"output_text": "\u2019s just how it is.\n\nI\u2019m not saying that I\u2019m not a private person, but I\u2019m not trying to be a private person. I\u2019m just trying to be a person.\n\nI\u2019m not trying to be a private person. I\u2019m just trying to be a person.\n\nI\u2019m not trying to be a private person. I\u2019m just trying to be a person.\n\nI\u2019m not trying to be", "input_text": " music?\n\nI mean popular music. It\u2019s not even like I\u2019m criticizing a person, I just think there\u2019s nothing redeeming about \u2026 I really don\u2019t want to be a jerk, but it\u2019s really just un-human music.\n\nYou were talking about colors and light to describe the sound of Fading Frontier. For me, this is a really warm record. It has a homey feel. This might sound weird, but I mean it as a compliment: This has been a really good record to do the dishes to.\n\nIt\u2019s kind of a domestic record. It\u2019s very domestic.\n\nIn \u201cLiving My Life,\u201d there\u2019s a line about being off the grid. I fantasize about that sometimes.\n\nI guess it\u2019s like a thing, a phrase, where people go off the grid without their cell phones. I didn\u2019t realize that was such a common idea. I guess I\u2019d heard it, but I didn\u2019t really know what it meant. It actually still works.\n\nDo you consider yourself to be an autobiographical songwriter, or has the personal nature of your songs been overstated?\n\nI\u2019ve never been autobiographical. I mean, there\u2019s no question that aspects of my life sneak in there, but I always try to mute things through a lens.\n\nWhy do you think your songs are read as personal statements?\n\nIt\u2019s kind of like, Why do you jack off with your right hand or your left hand? You just learn that way \u2014 it\u2019s your instinct. What hand do you jack off with? I don\u2019t mean to make it perverted. I\u2019m just saying there\u2019s certain things that are considered in general terms private. Like secret, private things, that you do alone, and that"} +{"output_text": " you\u2019re a new player and don\u2019t have a lot of money to spend, then I would say, \u201cdefinitely not\u201d.\n\nBut if you\u2019re a veteran player, and you have a lot of money to spend, then I would say, \u201cdefinitely yes\u201d.\n\nI\u2019m not saying that Kuroyuri is the best thing ever, but she\u2019s definitely a good addition to the meta.\n\nI\u2019m not saying that Kuroy", "input_text": " brings it up to 4x HP, at the cost of no RCV multipliers. This is where fitting TWO Kuroyuri onto the team really starts gaining value.\n\nTo give you a better idea of how tanky these teams are, the people over at the JP server calls these 4x HP teams as \u201cunsinkable fleets\u201d:\n\n\n\nDaks\u2019s Barbara & Julie team\n\nOgre\u2019s Noir team\n\nKAZUKI\u2019s Dark/Fire Rajoa team\n\n\n\nAnd people have been using Kuroyuri as inherits onto similar team:\n\n\n\nOgre\u2019s Vraska team, using Tadis as base for Kuroyuri\n\nNao\u2019s Ameno Costume Gintoki team, using Revo Isis as base\n\n\n\n\u2026 and etcetera. Similar builds can be used with the new Mega Awoken Evos of Ryune and Sylvie.\n\nI have seen some discussions on Reddit and elsewhere, that the \u201cKuroyuri Loop\u201d can be subbed out with healing subs such as Eir or Mel. This is definitely debatable; my opinion is that Kuroyuri is way more reliable and worry-free. A small little bonus on top of this, is that her skill also hastes 1 turn. Makes stalling that much less painful. When tackling an unforgiving dungeon such as Alt Arena, every little thing counts.\n\nThat being said, is buying TWO Kuroyuri worth 1,500,000 MP for the meta that\u2019s to sweep NA? My verdict is, \u201cprobably not\u201d. For most people, they won\u2019t have 1.5m MP on hand to spend by Oct 31st (when Kuruyuri/Leo finally leaves). If you have to sell REM cards to scrape together enough MP, OR if"} +{"output_text": " the game seriously.\n\n\u2013 We are a team of friends who have been playing together for a long time. We have a lot of experience and we are all very good friends. We are not a team that is just here to make money, we are here to have fun.\n\nThe team has been playing together for a long time, but the team has been around for a while.\n\n\u2013 We have been playing together for a long time, but we have been playing together for", "input_text": " a warning. And they should be there independent on whether the casters are ready or not, Mod645 says to Aftonbladet Esport and continues:\n\n\u2013 But you have to teach the player that it we are the one controlling the event. As of late, I don\u2019t have to yell at them as much as I used to because they know what\u2019s up.\n\nThere\u2019s been so many delays in CS:GO tournaments in recent years, are some of the attributed to the players?\n\n\u2013 I haven\u2019t really understood how that can be. One hour before the game I ask the teams if they are ready and make sure they will be on the server on time. Once the knife round has ended, players can tell the admin that they have to pee, at those moments I start asking my self \u2019you had an hour to prepare for the game, how can that preparation not include the toilet?\u2019 And that stuff happens on LAN as well.\n\nHow have the recent success on your admin career felt for you?\n\n\u2013 It\u2019s been fantastic, overwhelming. I don\u2019t think I have realized how far I\u2019ve come just yet.\n\nThe referee of some of the biggest games in the Counter Strike world is also a huge fan of the game. While his time playing the actual game have declined a lot since the work load have risen, his effort with his team Embrace have not.\n\nThe winners of Birdie LAN is a team created by David \u201dGodlike\u201d Livmar and during the 1.6 era had amongst many, Patrik \u201df0rest\u201d Lindberg in the line up. After a few years as a resting organization, Godlike once more brought the team back to participate in the CS:GO community.\n\nMod645 describes the team as a dedicated group of friends who take"} +{"output_text": " Russia, antifa MORE said in a new interview that he was \u201cstunned\u201d by President Trump Donald John TrumpOmar fires back at Trump over rally remarks: 'This is my country' Pelosi: Trump hurrying to fill SCOTUS seat so he can repeal ObamaCare Trump mocks Biden appearance, mask use ahead of first debate MORE\u2019s decision to fire him.\n\nComey said in an interview with ABC News that he was \u201cstunned\u201d by Trump\u2019s decision to", "input_text": ", easy as pie.\n\n\n\nFigure 19 Figure 19\n\nScenario 2: There are two opposite-colored cubes adjacent to each other. Same deal as scenario 1 but you'll need to perform it twice (once to get them to the same color, once again to orient them correctly.)\n\n\n\nFigure 20\n\nScenario 3: One corner piece is solved, but there are 3 that are unsolved. In this case you'll have to use your best judgment using your knowledge. If two corners have adjacent colors facing up, set up those two on the right face and use the algorithm in that orientation first. If not, it's super important to orient one of the outside corners first. If you solve the corner connecting the other two you will have two non-contiguous unsolved corners, which will just take you longer to solve.\n\nScenario 4: Two unsolved corners at opposite corners of the cube. This is the biggest pain in the butt. See if you can solve one of the cubes by picturing how it will be oriented after you use the algorithm (refer to figure 18.) This will most likely take 3 tries until it's solved. So after you finish this step you should be done!\n\nThis is the first version if this tutorial. There are likely typos which aren't a huge deal, but if you keep trying and trying and this guide isn't helping you solve your Rubik's cube, there may be a typo in one of the algorithms. If this is the case, please don't hesitate to e-mail and I'll see if I can fix it. dougwelch@gmail.com Former FBI Director James Comey James Brien ComeyDemocrats fear Russia interference could spoil bid to retake Senate Book: FBI sex crimes investigator helped trigger October 2016 public probe of Clinton emails Trump jabs at FBI director over testimony on"} +{"output_text": " Joey would rap. He\u2019d write rhymes on the back of his mom\u2019s paychecks, and he\u2019d record them on his dad\u2019s boom box. \u201cI was always writing, but I didn\u2019t know what I was doing,\u201d Joey says. \u201cI was just trying to rhyme.\u201d\n\nJoey\u2019s mom, a single mom herself, was a big influence on his music. She\u2019d always been a fan of hip-hop, and she", "input_text": " on the stoop outside the apartment of Cinematic Music Group founder and CEO Jonny Shipes, eating chicken and rice his mom made out of a Tupperware container. It\u2019s one of the first cool afternoons in September, and the young rhymer is rocking jeans and a navy Supreme hoodie to protect against the chill. Lately, Joey\u2019s been dealing with the pressure to finish his debut album, B4.Da.$$, due out by the end of the year, but he\u2019s in no rush to get moving.\n\nLife has come fast for the young MC. Born Jo-Vaughn Virginie, the teenage rap prodigy landed like napalm on an increasingly stale New York rap scene with his breakout mixtape, 1999, in June 2012. The tape earned praise for its blend of uncompromising lyricism and old school, boom bap beats from the likes of J Dilla, MF Doom and Lord Finesse. In the two years since, Joey\u2019s gone from freestyling in the hallways of his high school to headlining national tours with his rap crew, Pro Era and appearing on magazine covers as part of the 2013 XXL Freshmen Class. In many ways, Joey embodies the current feel of Brooklyn itself: holding onto history while updating its infrastructure for a new generation.\n\nBorn and bred in Brooklyn, Joey was drawn to hip-hop since he was a toddler, running to catch The Notorious B.I.G. music videos any time they came on TV. As a lone kid growing up in a single-parent home in Flatbush\u2014his father left his mother when Joey was five, though they\u2019ve maintained a good relationship\u2014he was often alone while his mom worked to support them.\n\nIt was during these periods, before and after school, that"} +{"output_text": ". She said she was born in Mexico and came to the US with her parents when she was three years old. She said she was shocked by Gruters\u2019s response.\n\n\u201cI was really surprised because I thought he was a good person,\u201d Chalco said. \u201cI thought he was a good person because he was a Republican. I thought he was a good person because he was a senator. I thought he was a good person because he was a politician. I thought he was a", "input_text": " from targeting only criminals and those who Trump considers to be \u201cbad hombres\u201d, families like Roxana\u2019s, with undocumented but working parents, would be at increased risk of deportation. They say a simple traffic stop could lead to an arrest and detention that law enforcement would be compelled to report to Immigration and Customs Enforcement before keeping that person in custody for up to 48 additional hours, to await collection by federal agents if Ice issued a detainer request.\n\n\u201cI don\u2019t understand why they get angry with families like ours that just want to have a better life,\u201d said Lily Montalvan, who has looked after Roxana and her son Ronnie, 16, alone since her husband, a construction worker, was deported.\n\n\u201cI had a beautiful family, always together. For my husband, his life was his work and us. This has destroyed us and I do not know how we are going to continue.\u201d\n\n\u2018It is horrible\u2019\n\nMontalvan and Roxana met Gruters in Tallahassee last week as part of a hundreds-strong delegation from groups including the Florida Immigrant Coalition (Flic) and United We Dream. She said she told him her husband was not a criminal but a hard worker with valid US government labour certification whose only goal since arriving from Peru in 1988 was to raise and support his family. She said Gruters, also the chair of the Republican party of Florida, was unmoved.\n\n\u201cI asked my daughter how she felt leaving the senator\u2019s office and she said, \u2018It is horrible,\u2019\u201d Montalvan said.\n\nI don\u2019t understand why they get angry with families like ours that just want to have a better life Lily Montalvan\n\nAlso among the delegation was 19-year-old Nataly Chalco, a student of political science and economics at Florida State University"} +{"output_text": " thing I didn\u2019t like was the fact that it didn\u2019t really remove any dead skin, but I\u2019m not sure if that\u2019s because I didn\u2019t use enough, or because it\u2019s not a strong enough product. I\u2019d definitely recommend it, and I\u2019d use it again.\n\nI\u2019ve been using this for a few weeks now, and I\u2019m really pleased with it. I\u2019ve been using it as a toner, and I\u2019", "input_text": " to run low \u2013 the design means it looks like it\u2019ll be difficult to get it out), and you don\u2019t have to use much so it will go a long way. Squeeze a small amount onto your fingertips, and rub into your face, constantly rubbing for 2-3 minutes.\n\nAfter just a minute, some skin started coming off my face, so it does seem to be effective. I say some, as peeling gel naturally balls up, as it were, when you rub \u2013 some of it is the product itself binding together, and some of it is your skin. Peeling gel gets rid of icky dead skin, and it is recommended no more than twice a week. Once is advisable. Never overdo this! Whilst the ingredients are not strong chemicals by any means (I know most equate skin peeling with something strong), as with exfoliation, you shouldn\u2019t overdo it. This is basically another way of exfoliating. After rubbing it into your skin and seeing the skin come off, wash the face thoroughly with warm water, and continue with your routine.\n\nThe day I used this, my skin felt really good and much cleaner, if that is a good way to describe it. It felt just that bit softer, and I did notice change to the skin on my forehead, which seemed a little less oily and just looked more even all over. If you have sensitive skin I\u2019d recommend this product, as it wasn\u2019t harsh and is much nicer than an exfoliation (which I hate on my skin). Use it if scrubs or things with bits in are too much for your poor skin!\n\nVerdict: I really liked this product. I like the packaging, the fact that it doesn\u2019t have any kind of scent to it, and what it did to my skin. The only"} +{"output_text": ".\n\nI\u2019m not saying that Pochettino is wrong to want to sign a player who is ready to play for the first team, but I am saying that he should be more consistent in his approach. He\u2019s not doing it for the sake of it, but because he genuinely believes that Edwards is ready. But if he doesn\u2019t, then he should be consistent in his approach.\n\nI\u2019m not saying that Pochettino is wrong to want to", "input_text": " satisfied that Edwards is \u2018ready\u2019, then he could be waiting a while and risk one of our greatest homegrown talents leaving. Pochettino has a ready-made excuse should Edwards\u2019 not make it: he wasn\u2019t right mentally. He had issues with authority. He didn\u2019t perform in training. And yet if he *does* make it, he claims all of the credit. That doesn\u2019t feel right; this is a joint venture. Recent press conference comments have made it feel otherwise.\n\nUltimately in Edwards\u2019 case I think this comes down to talent against mentality, and Pochettino\u2019s flexibility with certain players and not others. I know well enough from my own profession that, as a manager, it\u2019s impossible to treat everyone equally, because everyone is different with different motivations and values. But being seen to treat people consistently is important, and if Edwards sees concessions given to those who don\u2019t play as well as him or those who act up and still get games, then I imagine he\u2019s going to find that frustrating.\n\nThe non-selection of homegrown players isn\u2019t just a Pochettino issue, it\u2019s an English football issue. The mentality towards youth players *has* to change, because the levels of English and English-grown youth players have changed. Recent competitions suggest that England are producing some of the best youth players in the world; these are excellent footballers who will not let their teams down. And Spurs have one of the top four or five academies in the country, perhaps even top three. Signing a cheap foreign back-up is not necessary because we have *free* back-ups waiting in the wings who just need a chance to be taken on them. Nobody can convince me that Onomah wouldn\u2019t have done at least a good a job as Sissoko in our midfield"} +{"output_text": "I\u2019m so happy to be with my best friend and my boyfriend\u2019\n\nThe next day, Lopez posted a photo of himself and Maisani on Instagram, writing: \u2018I\u2019m so happy to be with my best friend and my boyfriend.\u2019\n\nCooper did not acknowledge his relationship with Maisani until May of 2015 when he posted a photo of the two men on Instagram, writing: \u2018I\u2019m so happy to be with my best friend and my boyfriend.\u2019\n\nThe", "input_text": " their AC2 show at the Beacon Theater in New York City, with Copper's boyfriend Maisani in attendance (Cohen and Cooper with Dolores Catinia and Maisani over Cohen's shoulder to the right)\n\nAnd he's off: Two days later, the manager at the Dallas pizzeria Cane Rosso posted a photo of himself with Cooper (above)\n\nTrading places: Maisani caused tongues to wag after he was spotted kissing another man in Central Park back in 2012 (above)\n\nThat outing was reported on by one excited birthday girl and their server, who both noted on social media that seeing Cooper was a highlight of their day.\n\nOne of the women was also able to confirm that Cooper was in fact with Lopez at the restaurant.\n\nCooper even agreed to pose for a picture with the manager, though Lopez did not join him in the photo.\n\nThat trip came after Cooper had performed his AC2 show in New York City alongside Andy Cohen, with his live in Maisani one of the guests in attendance.\n\nThe next stop on that tour was in Boston, and Lopez did the travelling that time, heading up to Boston to watch Cooper perform and posting photos from the audience.\n\nThe show was in Boston on the night of February 10th, and the doctor posted a selfie flying home from Boston on February 11th.\n\nFly the friendly skies: Lopez shared this selfie on his changed Instagram account flying home from Boston on February 11th, the day after Cooper played a show in the same city\n\nFriends and family: Cooper also posed up with Lopez's friends in a photo taken during his birthday weekend (above)\n\nOut and about: Cooper did not acknowledge his relationship with Maisani until May of 2015 when he posted a photo of the two men on Instagram, writing: \u2018"} +{"output_text": " he tweeted, \u201cThe Democrats are pushing for Open Borders, which will bring in all of the drugs that they can, and they will put millions of additional people on our country, and they will raise taxes through the roof. They will destroy our Military, and they will get rid of V.A. [Veterans Affairs] and C.A. [Civilian Agencies].\u201d\n\nThe Russian fakes and the Trumpian tweets are not just similar in content. They", "input_text": " up now to object that Khusyaynova and \u201cProject Lakhta,\u201d whose finances she masterminded and directed, did their politicking through fake personas and bogus social media accounts and Trump does his sowing under his own signature. But setting that aside, many of the Russian rants collected in the criminal complaint overlay Trump\u2019s tweets almost perfectly.\n\nFor instance, Project Lakhta members at the Facebook group \u201cSecured Borders\u201d gave these Trumpian tips on what to write on social media platforms. \u201cBrand [Sen. John] McCain as an old geezer who has lost it and who long ago belonged in a home for the elderly.\u201d Speaker Paul Ryan was to be branded as \u201ca complete and absolute nobody incapable of any decisiveness.\u201d Special counsel Robert Mueller? \u201c[A] puppet of the establishment.\u201d Sen. Marco Rubio? \u201c[A] fake conservative who is a traitor to Republican values and who in his soul despises the American Constitution and civil liberties.\u201d The leadership in sanctuary cities should be characterized as having \u201clost all connection with reality\u201d and \u201ctrying to provide criminals who illegally crossed the U.S. borders with voting rights that are available only to the citizens of the United States.\u201d\n\nCompare these Russian fakes with genuine expressions from the president\u2019s keyboard. Today he tweeted, \u201cSadly, it looks like Mexico\u2019s Police and Military are unable to stop the Caravan heading to the Southern Border of the United States. Criminals and unknown Middle Easterners are mixed in. I have alerted Border Patrol and Military that this is a National Emergy. Must change laws!\u201d On Friday he wrote, \u201cBeto O\u2019Rourke is a total lightweight compared to Ted Cruz, and he comes nowhere near representing the values and desires of the people of the Great State of Texas. He will never be allowed to turn Texas into Venezuela!\u201d Last week"} +{"output_text": " in jail, I\u2019d rather be in jail for a drug offense than for a homicide offense,\" Paul said.\n\n\u2022 Dr. Peter Reuter is a professor of medicine at the University of Toronto and the author of the book Addiction: A Global Epidemic.\n\nSign up for the Guardian US opinion newsletter\n\n", "input_text": "\nIn the 1980s, energetic marijuana interdiction led to a repurposing of trafficking routes towards less bulky and more lucrative cocaine. Even before widespread fentanyl, the potency of street heroin had already been steadily increasing. Each pyrrhic victory of the war on drugs has left us with stronger and more harmful drugs.\n\nA tragic irony of Paul\u2019s death is that he was in the middle of advocating for a supervised injection facility in the neighborhood to replace the filthy, dangerous spaces where hundreds of doses of heroin are injected daily.\n\nOpioid crisis: Donald Trump rejects calls to declare national emergency Read more\n\nSupervised injection facilities are an evidence-based way to keep people alive by allowing them to inject drugs under medical supervision during this uniquely dangerous time to struggle with addiction. There are already more than 100 of these facilities in 66 cities across nine countries including Canada. None of these countries is facing the scale of the crisis that we in the US face and yet they rightly eschew moralizing for the sake of public health. A facility like this in the US would start saving lives the day it opens.\n\nWhile pursuing necessary emergency measures, like supervised injection facilities, we need to also make every investment to achieve treatment on demand. Currently, only 10% of those in need of addiction treatment receive it. Buprenorphine and methadone are two evidence-based options for opioid agonist therapy that have repeatedly been shown to reduce death and support recovery and yet we drag our feet in matching access to need.\n\nA crisis of this magnitude demands an energetic, open and scientifically-grounded consideration of every available tool at our disposal and the political will to follow where the science leads. The moralizing punitive logic of the war on drugs helped lead us into this crisis. Only a public health approach can lead us out. \"If I\u2019ve got to be"} +{"output_text": " he and Curry said they believe the Warriors will be back in the Finals this season.\n\n\"I don't think it's over,\" Green said. \"I think we're going to be back. I think we're going to be back. I think we're going to be back. I think we're going to be back. I think we're going to be back. I think we're going to be back. I think we're going to be back. I think we're going", "input_text": " season.\n\n\"I am not sure what everyone's expectations are,\" Green told ESPN. \"We haven't really had any team meetings yet. But I know what my expectations are. They don't change from year-to-year. They are always the same.\"\n\nWhen asked specifically if he thinks the Warriors could go to the NBA Finals this season, Green was quick with his answer.\n\n\"Is Steph Curry on our team? Klay Thompson?\" Green said confidently. \"Yup.\"\n\nThe Warriors' roster is in transition after losing superstar forward Kevin Durant in a sign-and-trade with the Brooklyn Nets. Also gone are veteran stalwarts Andre Iguodala, who was traded to the Memphis Grizzlies in a corresponding move, and Shaun Livingston, who retired.\n\nDespite all the movement, both Green and Curry said they believe the expectation for the group should be the same as always: a trip to the NBA Finals.\n\n\"For sure,\" Curry told ESPN. \"As long as we have the solid core that we've got and the experiences to kind of back us up, we're going to keep that goal in mind. The fun part about it is that we get to kind of recreate the look of it and incorporating the new pieces that we have, and that part's the most exciting. The last five Finals have kind of been a certain way, so whatever we do from here, you're going to enjoy it even more.\"\n\nDraymond Green said on Monday that he has no doubt the Warriors will make the NBA Finals again this season. Andrew D. Bernstein/NBAE via Getty Images\n\nAfter losing in six games to the Toronto Raptors in the NBA Finals in June, Green said it would be \"stupid\" to think the Warriors' championship run is over. Both"} +{"output_text": " part of a new five-year plan. But the agency has yet to provide a timeline for the remaining long-term advisories.\n\n\u201cWe\u2019re not going to be able to say that we\u2019re going to be able to get rid of all of them in five years,\u201d said Mr. Sajjan. \u201cBut we\u2019re going to be able to get rid of a lot of them.\u201d\n\nThe federal government has also committed to spending $1.8-billion", "input_text": ". He promised to end all of them within five years; his government committed $1.8-billion toward upgrading water and wastewater systems.\n\nData published on Indigenous Services Canada\u2019s website show that 78 advisories have been rescinded since November, 2015. Most were resolved through repairing or replacing failed water-treatment plants, wells, distribution systems or other infrastructure. A handful were eliminated through improved water-quality monitoring and sampling, and a few more by connecting communities to systems in nearby municipalities. Meanwhile, more than 30 new advisories have stretched on for longer than one year; over all, 62 long-term advisories remain outstanding today.\n\nStory continues below advertisement\n\nBut other federal data suggest the condition of First Nations water systems hasn\u2019t changed much. Indigenous Services Canada uses a database called the Integrated Capital Management System (ICMS) to assess the risk that water systems present to the people they serve. Annual inspections assess each system\u2019s design, how well it\u2019s being operated and maintained, record keeping, the operators' training and the quality of source water; systems are then scored between 1 (presenting very low risk of producing unsafe water) and 10 (extreme risk).\n\nAn analysis of 11 years of ICMS data by The Globe, covering some 14,000 individual inspections, shows the national average risk score among the nearly 800 systems tracked on the ICMS barely budged since 2015.\n\nThis apparent lack of progress can be partly explained by the fact that some recently fixed water systems have yet to be re-inspected. That means the ICMS data doesn\u2019t fully reflect recent improvements. But the risk scores also point to something else: Ending advisories means something quite different than providing consistently safe, high-quality drinking water.\n\nIndigenous Services Canada said the remaining long-term advisories will be terminated by March, 2021, as"} +{"output_text": " Brasil 200.\n\nEu n\u00e3o sei se ele foi.\n\nO senhor n\u00e3o teme que o vice-presidente do Brasil 200 seja um dos que vai se colocar na frente do governo?\n\nN\u00e3o.\n\nO senhor n\u00e3o teme que o vice-presidente do Brasil 200 seja um dos que vai se colocar na frente do governo?\n\nN\u00e3o.\n\nO senhor n\u00e3o teme que o vice-presidente", "input_text": " ter de testar.\n\nMas garanto que at\u00e9 1%, com certeza, ele \u00e9 insoneg\u00e1vel porque qualquer alternativa para se evadir do pagamento custa mais de 1%.\n\nComo?\n\nSe voc\u00ea hoje paga 3% ou 4% pela conveni\u00eancia de usar o cart\u00e3o de cr\u00e9dito no sistema banc\u00e1rio moderno, sem ter de andar com malas de dinheiro, porque \u00e9 que voc\u00ea vai passar a andar com malas de dinheiro para economizar o 1% do imposto?\n\nSe voc\u00ea come\u00e7ar a fazer como o Geddel [Vieira Lima, ex-ministro de Michel Temer] e botar um monte de dinheiro em casa, tem uma eros\u00e3o, imposto inflacion\u00e1rio.\n\nEu n\u00e3o sei por que voltam nessa cr\u00edtica. Isso mostra deslealdade intelectual.\n\nVoc\u00eas est\u00e3o se colocando nessa guerra de iniciativas de reforma tribut\u00e1ria, com a proposta do movimento Brasil 200. Aonde voc\u00eas acham que v\u00e3o chegar? Porque a proposta do governo \u00e9 a do secret\u00e1rio Marcos Cinta, que j\u00e1 disse que a dele \u00e9 outra?\n\n\u00c9 praticamente a mesma.\n\nO Marcos Cintra j\u00e1 falou que a dele n\u00e3o \u00e9 de voc\u00eas. O Senado tem uma outra e a C\u00e2mara tem outra?\n\nEu n\u00e3o vou entrar nessa briga fulanizada. Tem egos demais nesse neg\u00f3cio.\n\nO vice-presidente Hamilton Mour\u00e3o foi ao evento de lan\u00e7amento da proposta de voc\u00eas do grupo"} +{"output_text": " in the plan, but the bottom line is that Warren\u2019s plan would cost $32.6 trillion over 10 years, and generate $32.6 trillion in savings.\n\nThe plan would be financed by a combination of new taxes and cuts to existing programs.\n\nWarren\u2019s plan would also eliminate the employer mandate, which would save $1.5 trillion.\n\nWarren\u2019s plan would also eliminate the individual mandate, which would save $1.2 trillion.", "input_text": ", and Simon Johnson, the former chief economist at the World Bank. Berwick was an advocate for the Independent Payment Advisory Board, which was envisioned as a body to cut Medicare costs and ration care under the ACA.\n\nIn the course of a decade, Warren\u2019s Medicare for All would achieve savings on the following basis:\n\n\u2022 Private insurers currently consume about 12.2 percent for \u201cadministrative costs\u201d and profits. Warren assumes this would fall to 2.3 percent, saving $1.8 trillion.\n\n\u2022 Warren proposes cutting payment rates for brand-name drugs by 70 percent, saving $1.5 trillion.\n\n\u2022 All physicians would be paid at current Medicare rates, and hospitals would be paid at 110 percent of that rate, saving an estimated $600 billion.\n\n\u2022 ACA-era payment \u201creforms\u201d would be implemented across the single-payer system, moving away from fee-for-service. This would save an estimated $2 trillion.\n\n\u2022 $1.1 trillion could be saved by holding health spending growth to 3.9 percent over the next decade.\n\nAdditional funding would be generated by raising taxes, including:\n\n\u2022 A financial transactions tax of 0.1 percent of the value of every stock, bond or derivatives transaction, raising $800 billion.\n\n\u2022 A 35 percent minimum tax on foreign earnings, bringing in $2.9 trillion.\n\n\u2022 A 6 percent wealth tax on assets over $1 billion, generating $1 trillion.\n\n\u2022 Taxing capital gains for the top 1 percent at the same rate as normal income, and doing so annually, would raise $2 trillion.\n\nAnother major source of revenue would result from private employers paying to the government the $9 trillion they would have spent on private health insurance for their employees.\n\nThere are many other convoluted details"} +{"output_text": " tell you how many times I've heard people say, \"Well, I don't know if it's torture, but it's certainly cruel and unusual punishment.\"\n\nYoo says, \"That's exactly right. That's exactly right. That's exactly what I was thinking.\"\n\nHe says he was also influenced by the fact that the U.S. Supreme Court had never defined torture.\n\nI was trying to figure out what the Supreme Court would say.\n\nYoo", "input_text": ", that there is something irreducible and inviolable about every person,\" says Yoo's fellow Berkeley law professor\n\nRobert H. Cole. \"You can't write a memo about it the way you would write about snowmobiling in Yosemite.\" At the very least, they say, Yoo should have warned of the moral danger the question posed to the essence of America.\n\nYoo says he shared those concerns. He says he thought he was writing a memo for exceptional cases, for the highly trained specialists of the CIA. \"I never thought it would be a good idea for the Army to do it, to put it in the hands of eighteen-year-old kids. But it would be inappropriate if I had that worry and it changed the way I interpreted the law.\"\n\nSo he buckled down to one of the world's most thankless jobs, defining the limits of acceptable pain. He knew it would be easy to draw a vague standard that sounded good and then give the CIA a meaningful wink. But that wouldn't be fair to the officers in the field.\n\nHe wanted to draw a clear line.\n\nThe problem was, the Justice Department had never prosecuted anyone under the antitorture statute, so there were no judicial opinions to guide him. Dictionaries defined severe as \"extreme\" and \"hard to endure.\" Yoo studied all the international precedents he could find, including the judgment of the European Court of Human Rights in Ireland v. the United Kingdom, which found that the use of hoods, continuous loud noise, sleep deprivation, reduced diet, and a stress position called \"wall-standing\" were all cruel and degrading but not torture.\n\nSo where was the line?\n\nHe got the crucial phrasing about organ failure and death from a U. S. law concerning health care.\n\nI can't"} +{"output_text": ". And that is what I have tried to do in this speech.\n\nI have tried to be clear about the challenges we face, and the choices we face. I have tried to be clear about the values we share, and the values we do not. I have tried to be clear about the values we need to defend, and the values we need to change. I have tried to be clear about the values we need to build, and the values we need to preserve.\n\nI", "input_text": "isobhani\u2019s mother feeling alone in a new country.\n\nKnowing that the same thing could happen to her daughter has left the mom \u201cdevastated,\u201d Alisobhani said.\n\nCourtesy Nassim Alisobhani Nassim Alisobhani's parents were married in 1986.\n\n\u201cMy parents\u2019 wedding was great, but my mom always spoke of it as a sad moment for her,\u201d Alisobhani said. \u201cI\u2019m not going to be as lonely as her, but it\u2019s still going to be a dark spot.\u201d\n\nBut, Alisobhani said, it\u2019s more than just about her family \u2015 it\u2019s about the Syrian refugees who are being turned away, students whose educations are at risk of being disrupted, and others looking to come to America. There has never been an easy time to be a social democrat (or \u201cdemocratic socialist\u201d as we sometimes call ourselves in Britain). Whereas the right can demonise the poor and extol the virtues of the market, and the hard left can demonise the market and extol the role of the state, our position of constraining the domination of markets and reforming the state is, by definition, more complex.\n\nIt is nonetheless the case that social democracy has a historic responsibility, in every generation, to renew democracy and preserve a civic culture. This is achieved not through soundbites and slogans, but through the hard-headed development of a progressive politics that reconciles liberty and democracy, new comers and locals to our communities, business and workers, in a common life that preserves security, prosperity and peace. This historic mission is all the more urgent now and my determination that we succeed has grown not weakened since our election defeat last May.\n\nBut, in order to be heard, it is necessary to make balanced and reasonable argument"} +{"output_text": " you possibly make this show interesting? You couldn\u2019t. You couldn\u2019t even make it interesting for the fans of the original manga. You couldn\u2019t even make it interesting for the fans of the original manga. You couldn\u2019t even make it interesting for the fans of the original manga. You couldn\u2019t even make it interesting for the fans of the original manga. You couldn\u2019t even make it interesting for the fans of the original manga. You couldn\u2019t even make it interesting", "input_text": " stupid, skeevy characters, ALIENS *hand gestures*, and a mecha that really didn\u2019t even fucking matter. It was the end of the Urobutcher\u2019s hot streak, leading to Aldnoah.Zero next year.\n\nDishonorable Mentions:\n\n\nCoppelion - Originally announced in 2010, this anime adaptation of a 2008 manga was delayed due to a major plot twist in the worst anime of all time, Real Life. An actual nuclear disaster in Japan ended up causing this disaster anime about a fictional nuclear meltdown to be delayed for three years. You would think having to wait so long means it\u2019s a graphic, gut-wrenching depiction of the horrors of a nuclear accident, right? No! It\u2019s fucking boring! How do you make a disaster show boring?! Even the color palette sucks! It\u2019s all washed out like the picture above.\n\n\n\n\nFlowers of Evil - Based off a very well-received manga, everyone who was anticipating a really good horror thriller anime to watch weekly was sucker-punched by the opening episode and its suspect animation choices (see adaptation below), But even with bad and baffling choices that were specifically asked for by the original author, we could still get the story we loved, right? Well, no, because they also decided to slow the pacing to an absolute crawl. Even if this was animated well, the story would still have been the worst thing an anime could be: EXTREMELY BORING.\n\n\n\n\nWanna Be The Strongest! - Speaking of boring, how the fuck did you make a show about FEMALE IDOLS DOING PRO-WRESTLING boring?! You went all-in on PLOT and completely scrapped the actual plot, and you were STILL unable to hold anyone\u2019s attention. How could"} +{"output_text": ", said she was too scared to fight back.\n\n\"I was like, 'I'm not going to fight you,'\" she said. \"I'm not going to get hurt.\"\n\nBarthelemy, who was raised in a single-parent home, said she had no idea how to get out of the situation. \"I was like, 'I don't know what to do,'\" she said.\n\nShe had no choice but to go to the police.\n\n", "input_text": " Island. \"My sister said she would never go outside the five boroughs,\" said Melissa Cann, 27, of New London, Conn. \"She would only do in-calls at hotels where she knew that the front desk had security and video cameras.\"\n\nBrainard-Barnes, a single mother from Norwich, Conn., struggling to support two children, was introduced to prostitution through a modeling job in Manhattan, her sister said.\n\nTo her family, it was an unexpected path for a bookish young woman who published poetry on her MySpace profile and invented games for her children. Brainard-Barnes insisted to her sister that it was not her full-time profession and in early 2007 she worked as a telemarketer.\n\nBy July, however, Brainard-Barnes had been laid off and faced eviction from her apartment, Cann said. On Friday, July 6, 2007, Brainard-Barnes left her children with their fathers, took a train to Manhattan, got a hotel room near Times Square and posted an ad on Craigslist, Cann said.\n\nBrainard-Barnes did not return to Connecticut the following Monday, as she had promised, Cann said. She did call friends that day -- her last contacts before going missing. \"There was nothing distressful,\" Cann said.\n\nThe next year, police contacted Cann and asked whether her sister had ever worked on Long Island. \"Never,\" Cann replied.\n\nPolice then told Cann a surprising detail: Her last cellphone call pinged a tower on the South Shore, Cann said.\n\nWanted to open hair salon\n\nMelissa Barthelemy had already had a brush with prostitution's dangers. A john once tried to mug her near her Bronx home with a knife. Barthelemy, who stood 4-foot-11 and weighed 95 pounds"} +{"output_text": " vez teria informado o Minist\u00e9rio P\u00fablico. E o Minist\u00e9rio P\u00fablico teria feito a investiga\u00e7\u00e3o. E o Minist\u00e9rio P\u00fablico teria feito a den\u00fancia. E o Minist\u00e9rio P\u00fablico teria feito a defesa. E o Minist\u00e9rio P\u00fablico teria feito a acusa\u00e7\u00e3o. E o Minist\u00e9rio P\u00fablico teria feito a condena\u00e7\u00e3o. E o Minist\u00e9", "input_text": " violado.\u201d\n\nA entrevista terminou perto das 22 horas. Acompanhados por um seguran\u00e7a do hotel, os dois foram em dire\u00e7\u00e3o ao elevador. Rosa Costa, a jornalista do Estad\u00e3o, e outros dois rep\u00f3rteres que haviam sido * corretos* ao longo daqueles dias entraram com eles. Quando a porta se fechava, Andrei Meireles tentou entrar. Wlicio se virou para o seguran\u00e7a e disse: \u201cEsse n\u00e3o.\u201d\n\nEm p\u00e9, encostado na janela aberta de sua sala para n\u00e3o empestear o ambiente com os cigarros que fuma sem parar, Wlicio pergunta: \u201cVoc\u00ea imagina o que teria acontecido se o Francenildo n\u00e3o tivesse sa\u00eddo do programa de prote\u00e7\u00e3o? A not\u00edcia da \u00c9poca saiu na sexta. Seria, no m\u00ednimo, um fim de semana inteiro sem explica\u00e7\u00e3o. Sexta, s\u00e1bado, domingo, segunda. 1 a 0, 2 a 0, 3 a 0, 4 a 0 para o Palocci. Quatro dias \u00e9 muita coisa. Para explicar tudo depois, seria bem mais dif\u00edcil. Por isso eles cometeram o erro: pressa. Se tivessem esperado at\u00e9 segunda, n\u00e3o haveria crime algum. A Caixa teria informado o Banco Central sobre a movimenta\u00e7\u00e3o at\u00edpica na conta do Francenildo, o BC teria repassado a informa\u00e7\u00e3o para o Coaf, que por sua"} +{"output_text": " vai ganhar nada. Voc\u00ea vai trabalhar, mas n\u00e3o vai ganhar nada. Voc\u00ea vai trabalhar, mas n\u00e3o vai ter nada. Voc\u00ea vai trabalhar, mas n\u00e3o vai ter nada. Voc\u00ea vai trabalhar, mas n\u00e3o vai ter nada. Voc\u00ea vai trabalhar, mas n\u00e3o vai ter nada. Voc\u00ea vai trabalhar, mas n\u00e3o vai ter nada. Voc\u00ea vai trabalhar, mas n\u00e3o vai ter nada. Voc\u00ea", "input_text": "la e entrou em contato com um corretor chamado Jo\u00e3o Gustavo Abreu Coutinho. Jo\u00e3o Gustavo trazia clientes para visitar o im\u00f3vel, Francenildo abria a porta e ajudava a mostrar as depend\u00eancias. Com o tempo, os dois ficaram pr\u00f3ximos, camaradas.\n\nUm dia, o corretor apareceu com um homem de meia-idade, rechonchudo e simp\u00e1tico, de cabelos ralos e um bigode largo que lhe ca\u00eda feito um circunflexo sobre a boca. Chamava-se Vladimir Poleto. Vinha de Ribeir\u00e3o Preto, no interior paulista, e falava em nome de um grupo de amigos que procuravam uma boa casa na capital federal. Depois de percorrer o jardim, avaliar a piscina, medir a sala e ver os quartos, pareceu satisfeito. Abriu a porta do carro e, antes de dizer ao motorista Francisco das Chagas que partisse, avisou ao corretor que entraria em contato.\n\n\n\n\n\n\n\nO neg\u00f3cio foi fechado no dia seguinte. Vladimir Poleto praticamente dobrou o sal\u00e1rio do casal de empregados: \u201cAgora voc\u00ea vai ganhar 700 reais e tua mulher tamb\u00e9m.\u201d Francenildo se alegrou, e n\u00e3o teve problema em concordar \u2013 \u201cClaro, \u00e9 o senhor que est\u00e1 me pagando\u201d \u2013 quando Poleto estabeleceu as novas regras: \u201cO que acontecer aqui, voc\u00ea n\u00e3o"} +{"output_text": " biggest challenges facing the New Orleans school district?\n\n\n\n\n\nCD: I think the biggest challenge is that the district is not a business. It\u2019s a public entity. It\u2019s a public school district. And it\u2019s a public school district that is being run by a private company. And that\u2019s a very different thing.\n\n\n\n\n\nEA: So what do you see as the biggest challenges facing the New Orleans school district?\n\n\n\n\n\nCD: I think the biggest challenge is that the", "input_text": "%) comes from page 31 of the Cowen Institute 2012 analysis of New Orleans schools; Ms. Jacobs fails to mention that this 39% is for the subgroup of Orleans Parish Public School charter schools and not the rate for New Orleans schools overall.\n\nA Closing Word\n\nThe school- and district-level data presented in this post unequivocally demonstrates that the state-run RSD is hardly a miracle. It should be an embarrassment to any reformer insisting otherwise. And it should come as no wonder why RSD doesn\u2019t even mention school letter grades on its website.\n\nThe history of the state-run RSD in New Orleans is one of opportunism and deceit, of information twisting and concealing, in order to promote a slick, corporate-benefitting, financially-motivated agenda. It is certainly not \u201cfor the children.\u201d\n\nIt is very easy for corporate reform to stand in front of the media and proclaim a New Orleans miracle. Bobby Jindal is doing it. So are John White, Wendy Kopp, Leslie Jacobs, and a host of others. No matter how oft-repeated the term \u201cNew Orleans miracle\u201d has become, it is a lie.\n\nTo other districts around the nation who are considering adopting \u201cthe New Orleans miracle\u201d:\n\nReread this post, and truly consider what it is that you would be getting: A lie packaged to only look appealing from afar. \n\n\n\nDrexler is an absolute class act and was a pleasure to interview, talking as if we were old friends. Here is what he had to say:\n\n\n\n\n\nElijah Abramson: First let me say that I really appreciate the time that you are taking.\n\n\n\n\n\nClyde Drexler: It\u2019s my pleasure. : It\u2019s my pleasure.\n\n\n\n\n\nEA: So what do you see as the"} +{"output_text": "\u062d\u0648\u0644 \u0627\u0644\u0627\u0642\u062a\u0635\u0627\u062f\u064a \u0648\u0627\u0644\u062a\u062d\u0648\u0644 \u0627\u0644\u0627\u062c\u062a\u0645\u0627\u0639\u064a.\n\n\u0648\u0642\u0627\u0644 \u062a\u0631\u0648\u062a\u0633\u0643\u064a:\n\n\"\u0627\u0644\u0625\u0645\u0628\u0631\u064a\u0627\u0644\u064a\u0629 \u0648\u0627\u0644\u0641\u0643\u0631\u0629 \u0627\u0644\u0642\u0648\u0645\u064a\u0629\" \u0647\u064a \u0645\u0627 \u064a\u062c\u0639\u0644\u0646\u0627 \u0646\u062a\u0645\u064a\u0632 \u0628\u0627\u0644\u0625\u0645\u0628\u0631\u064a\u0627\u0644\u064a\u0629 \u0627\u0644\u0642\u0648\u0645\u064a\u0629\u060c \u0648\u0627\u0644\u0641\u0643\u0631\u0629 \u0627\u0644\u0642\u0648\u0645", "input_text": "\u0629 \u0627\u0644\u0642\u0648\u0645\u064a\u0629 \u0646\u0641\u0633\u0647\u0627.\n\n\u0642\u0628\u0644 \u0645\u0627\u0626\u0629 \u0633\u0646\u0629 \u0628\u0627\u0644\u0636\u0628\u0637\u060c \u0641\u064a \u0645\u0627\u064a\u0648 1915\u060c \u0643\u062a\u0628 \u0644\u064a\u0648\u0646 \u062a\u0631\u0648\u062a\u0633\u0643\u064a \u0645\u0642\u0627\u0644\u0627\u064b \u0628\u0639\u0646\u0648\u0627\u0646 \"\u0627\u0644\u0625\u0645\u0628\u0631\u064a\u0627\u0644\u064a\u0629 \u0648\u0627\u0644\u0641\u0643\u0631\u0629 \u0627\u0644\u0642\u0648\u0645\u064a\u0629\"\u060c \u0648\u0627\u0644\u0630\u064a \u0642\u0627\u0645 \u0641\u064a\u0647 \u0628\u062a\u062d\u0644\u064a\u0644 \u0627\u0644\u0622\u062b\u0627\u0631 \u0627\u0644\u062a\u0627\u0631\u064a\u062e\u064a\u0629 \u0648\u0623\u0647\u0645\u064a\u0629 \u0627\u0644\u062d\u0631\u0628 \u0627\u0644\u0639\u0627\u0644\u0645\u064a\u0629 \u0627\u0644\u0623\u0648\u0644\u0649:\n\n\u0625\u0646 \u062a\u062f\u0645\u064a\u0631 \u0623\u0633\u0633 \u0627\u0644\u0627\u0642\u062a\u0635\u0627\u062f\u060c \u0648\u0627\u0644\u062d\u0631\u0628 \u0627\u0644\u0625\u0645\u0628\u0631\u064a\u0627\u0644\u064a\u0629 \u0627\u0644\u062d\u0627\u0644\u064a\u0629\u060c \u0648\u0625\u0644\u0642\u0627\u0621 \u0627\u0644\u0636\u0648\u0621 \u0639\u0644\u0649 \u0627\u0644\u0628\u0624\u0633 \u0627\u0644\u0631\u0648\u062d\u064a \u0648\u062a\u0636\u062e\u064a\u0645\u0647\u060c \u0648 \u0627\u0644\u062f\u062c\u0644 \u0627\u0644\u0645\u0635\u0627\u062d\u0628 \u0644\u0644\u0641\u0643\u0631\u0629 \u0627\u0644\u0642\u0648\u0645\u064a\u0629\u060c \u0647\u0648 \u0627\u0644\u062a\u0639\u0628\u064a\u0631 \u0627\u0644\u0623\u0643\u062b\u0631 \u0625\u0642\u0646\u0627\u0639\u0627 \u0639\u0646 \u0627\u0644\u0637\u0631\u064a\u0642 \u0627\u0644\u0645\u0633\u062f\u0648\u062f \u0627\u0644\u0630\u064a \u0623\u062f\u0649 \u0627\u0644\u064a\u0647 \u062a\u0637\u0648\u0631 \u0627\u0644\u0645\u062c\u062a\u0645\u0639 \u0627\u0644\u0628\u0631\u062c\u0648\u0627\u0632\u064a. \u0625\u0646\u0647\u0627 \u0641\u0642\u0637 \u0627\u0644\u0627\u0634\u062a\u0631\u0627\u0643\u064a\u0629 \u0627\u0644\u062a\u064a \u062a\u0633\u062a\u0637\u064a\u0639 \u0623\u0646 \u062a\u0639\u062a\u0642 \u0627\u0644\u0627\u0642\u062a\u0635\u0627\u062f \u0627\u0644\u0639\u0627\u0644\u0645\u064a \u0645\u0646 \u0627\u0644\u0642\u064a\u0648\u062f \u0627\u0644\u0648\u0637\u0646\u064a\u0629\u060c \u0648\u0628\u0627\u0644\u062a\u0627\u0644\u064a \u062a\u062d\u0631\u064a\u0631 \u0627\u0644\u062b\u0642\u0627\u0641\u0629 \u0627\u0644\u0648\u0637\u0646\u064a\u0629 \u0645\u0646 \u0642\u0628\u0636\u0629 \u0627\u0644\u0645\u0646\u0627\u0641\u0633\u0629 \u0627\u0644\u0627\u0642\u062a\u0635\u0627\u062f\u064a\u0629 \u0628\u064a\u0646 \u0627\u0644\u062f\u0648\u0644. \u062a\u0648\u0641\u0631 \u0627\u0644\u0627\u0634\u062a\u0631\u0627\u0643\u064a\u0629 \u0648\u0633\u064a\u0644\u0629 \u0644\u0644\u062e\u0631\u0648\u062c \u0645\u0646 \u0627\u0644\u062a"} +{"output_text": " was the only girl in her class who didn't play sports, and she was the only one who didn't have a boyfriend.\n\n\u201cI was always the weird kid,\u201d she said. \u201cI was the one who was always the loner.\u201d\n\nShe was also the one who was always the outsider.\n\n\u201cI was always the one who was the weird kid. I was the one who was always the loner.\u201d\n\nEryn's mother, a nurse,", "input_text": " she felt warm despite the chilly night air.\n\n\u201cIt\u2019s moments like that where I feel like what we\u2019re doing is definitely a sense of real to it because everyone\u2019s feeling something,\u201d said Eryn. \u201cWe can\u2019t all be imagining it.\u201d\n\nAmong the average festivalgoers were prominent community leaders, authors, and educators, such as Curott, a New York attorney-turned-Wicca priestess who experienced a spiritual awakening in the '80s after having unexplained dreams and premonitions.\n\nThat prompted her to take a bus each week after work to a downtown occult bookstore, where she'd meet with women in a broom closet lit by a dangling light bulb, jars of herbs and statues of goddesses surrounding them.\n\nIt was everything her rational, masculine work world was not: a diverse group of smart, fascinating women guiding each other to see the sacred in themselves and in the natural world.\n\nThat's the same guiding principle today, she says, although witchcraft has since come out of the shadows of a bookstore closet and into the mainstream.\n\n\u201cAll of a sudden, you have a generation of young women,\u201d Curott said, \u201cwho\u2019ve discovered that the witch is the ultimate feminist icon.\u201d\n\nFacing skeptics, finding acceptance\n\nWhen Eryn meets people and tells them she's a witch \u2014 it's not a fact she hides, although it's also not the first thing out of her mouth \u2014 she's often met with a Harry Potter joke.\n\nAnd that's one of the better scenarios. Most often people are intrigued, but sometimes they're skeptical or outright afraid she'll put a hex on them.\n\nIn Canton, the cookie-cutter suburb where she was raised, Eryn always felt like a fish out of water. She"} +{"output_text": " I\u2019m not afraid to sleep on a train.\n\nPhoto by John Moore/Getty Images", "input_text": " employees felt the radiation-emitting Rapiscan imagers were ineffective, and that the TSA tried to work around the machines\u2019 inherent flaws with secret directives involving additional patdowns\u2014qualified as a whistleblowing act. Other than that, I\u2019ve mostly just been telling stories of public interest.\n\nThat doesn\u2019t mean I don\u2019t consider some of what the TSA has been doing the last few years scandalous; I do. Perhaps the most egregious waste of money at the agency right now is the SPOT program, in which \u201cBehavior Detection Officers\u201d are supposed to read people\u2019s body language in order to identify would-be terrorists.\n\nA decade in, we\u2019ve now spent a billion dollars on the program despite the fact that it\u2019s based on pseudoscience that has been debunked in one study after another, and there\u2019s no proof it has turned up even one terrorist threat. Many of the Behavior Detection Officers I knew at O\u2019Hare privately admitted that their program amounted to a lot of walking around all day getting paid a lot of money for doing nothing.\n\nI used to hear all the time from both passengers and TSA agents that airport security would make great fodder for a TV show or book. Since my essay was published, I\u2019ve heard from agents and producers who share that sentiment, and I recently signed with a literary agent. With any luck, my true TSA stories will be bound for bookshelves soon.\n\nOne of the most common questions I get now is: \u201cDo you get extra screening when you fly these days?\u201d I haven\u2019t flown since my essay was published, but I will soon.\n\nThen again, there\u2019s a train that can get me to New York. It may take 16 hours longer, but sleeper cars are kind of nice, and"} +{"output_text": " responses), the most common reason for using marijuana to treat CVS was to reduce the severity of the symptoms (49%), followed by to relieve nausea (38%), to reduce the frequency of vomiting (33%), and to reduce the duration of vomiting (31%) [92].\n\nThe use of marijuana to treat CVS is not without risk. In a case series of five patients with CVS, all five patients reported that marijuana use had worsened their CVS symptoms [93]. In a case", "input_text": " 48 h but can last for days [2]. Other signs and symptoms of CHS include anorexia, retching, increased salivation, stomach pain, cramps, headache, listlessness, pallor, photophobia, phonophobia, and malaise [16]. CHS is predicated on a long history of regular marijuana use, but this may not suffice to distinguish it from CVS, in that 53% of CVS patients in one study reported having used marijuana [90] and many patients are reticent to discuss their substance use at all.\n\nSince vomiting is a symptom of many conditions, CHS patients often undergo computed tomography scans, magnetic resonance imaging, gastric emptying tests, endoscopy, colonoscopy, even exploratory procedures before a CHS diagnosis is made [91]. It is not unusual for CHS patients to present frequently at the ED and to be hospitalized multiple times per year. This suggests that prompt, accurate diagnosis of CHS may save the healthcare system considerable resources and spare the patient unnecessary testing and procedures.\n\nPatients with CVS may be prescribed a proton pump inhibitor (PPI) to help manage their GI symptoms, but the use of a PPI can result in a false-positive for cannabinoid on a urine test. The PPI pantoprazole provides this information on its labeling, but it is not widely appreciated among clinicians since there are no known published case studies of this in the literature [76]. In a case study of a 13-year-old girl who was treated with pantoprazole for CVS, a urine test was positive for cannabinoids but a subsequent gas chromatography-mass spectrometry test returned negative [76]. The patient had CVS, not CHS, despite a false-positive for marijuana.\n\nCVS patients sometimes use marijuana to help manage their emetic symptoms. In a survey of 514 respondents (437 completed"} +{"output_text": " tried to defend his record on criminal justice reform, saying, \u201cI\u2019m proud of the fact that we put more people in jail than any other president in the history of this country.\u201d\n\nBut Booker wasn\u2019t having it. \u201cI\u2019m not going to sit here and tell you that I\u2019m proud of the fact that we put more people in jail than any other president in the history of this country,\u201d he said. \u201cI\u2019m not going to sit here and tell", "input_text": ",\u201d said Booker, who favors civil penalties, not criminal charges, for illegal immigrants.\n\nAdded de Blasio, \u201cVice President Biden, I didn\u2019t hear your response when the issue came up of all those deportations. \u2026 I didn\u2019t hear whether you tried to stop them or not using your power, your influence in the White House.\u201d\n\nBiden side-stepped, floundering to defend President Obama\u2019s record \u2014 and his own \u2014 by incorrectly stating that Obama signed into law the Deferred Action for Childhood Arrivals policy, which was put into effect through an executive order.\n\nBut de Blasio refused to let the issue go, pressing, \u201cYou want to be president of the United States. You need to be able to answer the tough questions. I guarantee you if you\u2019re debating Donald Trump he\u2019s not going to let you off the hook.\u201d\n\nTo that, Biden pleaded Oval Office confidentiality: \u201cI keep my relationship private. \u2026 I expect you would go ahead and say whatever was said privately.\u201d\n\nBut Booker smelled blood in the water and delivered one of the night\u2019s most stinging rebukes. \u201cYou invoke President Obama more than anybody in this campaign,\u201d lashed Booker. \u201cYou can\u2019t do it when it\u2019s convenient and dodge it when it\u2019s not.\u201d\n\nDe Blasio hardly escaped the debate unscathed, getting targeted for his handling of the lead contamination crisis in New York\u2019s public housing system, as well as his refusal to ax the NYPD officer involved in the firestorm death of Eric Garner \u2014 a point which saw Hizzoner receive his own dose of audience heckling.\n\nBut it was the two-term vice president who wore the biggest bull\u2019s-eye throughout the night.\n\nBooker hit home again as Biden"} +{"output_text": "deductible penalties.\n\nBut they did get a bailout of a different sort. They got a bailout of their own.\n\nThe bailout of the banks was a bailout of the banks. The bailout of the banks was a bailout of the banks. The bailout of the banks was a bailout of the banks. The bailout of the banks was a bailout of the banks. The bailout of the banks was a bailout of the banks.", "input_text": " rise of \u201czombie titles\u201d \u2013 a monstrous mounting of unpaid debts, fines, fees and assessments that stalked homeowners who had thought they\u2019d \u201clost\u201d their houses to foreclosure, only to find out later that, for instance, JPMorgan Chase decided it wasn\u2019t worth the trouble to complete the foreclosure after evicting the occupants. Out of nowhere, they found \u201ctheir wages garnished, their credit destroyed and their tax refunds seized.\u201d And some even faced jail time.\n\nSo, while JPMorgan Chase, Bank of America and other bailed-out mega-sharks spent seven years consolidating, hoarding and avoiding even the threat of jail time by forking over tax-deductible penalties to revolving-door regulators, \u201caverage\u201d Americans went back to square one in an economy that has one in three Americans teetering on the brink of financial ruin.\n\nIn spite of a drop in household debt relative to pre-crash levels, many are still locked into a consumer credit system. A study by Card Hub found that the average household\u2019s credit card balance for the first quarter of 2015 was $7,177. That\u2019s the highest it\u2019s been in six years. And the Federal Reserve reported that Americans added $20.7 billion in debt in June. That brings \u201ctotal consumer borrowing to a record $3.42 trillion,\u201d according to The Associated Press. Despite a wave of painful post-crash deleveraging, Americans are still swimming into deeper and deeper red ink.\n\nThe United States, in Red and Black\n\n\u201cAverage\u201d Americans didn\u2019t get floated on a flood of bailout bucks. They didn\u2019t get access to the Fed\u2019s freebies during quantitative easing. Nor did they have the money or connections to wipe away legal obligations by cutting deals to pay tax-"} +{"output_text": " pervert.\u201d It\u2019s a song that\u2019s so bad it\u2019s good, and it\u2019s a song that\u2019s so bad it\u2019s good because it\u2019s so bad.\n\nKouki Uchiyama (Kuroko\u2019s Basketball) - Kouki is a character who is so bad that he\u2019s good. He\u2019s a character who is so bad that he\u2019s good because he\u2019s so bad. He\u2019s a character", "input_text": "Loser: Q-vier (Valvrave) - Valvrave is a dumping ground for hilariously over-the-top cliches and stock character archetypes, but none more so than good ol\u2019 Q-vier. This kill-crazy member of the infamous germanically-numbered Dorssian barbershop quartet has an empathy level only matched by the size of his one-dimensional personality. When he\u2019s not killing a character, he\u2019s imagining killing a character or complaining about not killing a character. While most of the cast, through one forced reason or another, goes through some kind of character development (that is, developing from a one-dimensional joke into a one-and-half-dimensional bigger joke), Q-vier is just one joke right up until his end... at the hands of a member of his quartet. Shocking betrayal! Although not as shocking of a betrayal as including such a simple psychopath as a major character.\n\n\nDishonorable Mentions:\n\n\nHiroomi Nase (Beyond the Boundary) - A creepy siscon who makes constant passes at his underage sister even as he mocks his supposed best friend for a glasses fetish, Hiroomi is a dickweed at the best of times. His obsessive overprotectiveness takes it up to eleven, and even then we\u2019re not seeing the full picture. There\u2019s a character song album included in the physical release of the show that features Hiroomi and Akihito (the protagonist) each singing their own song, and then a duet.The duet is called Welcome To The World, wherein both of them sing about their respective fetishes, simultaneously mocking and trying to convert each other, before effectively turning to the listener and going \u201cAnd if any of that sounded hot, welcome to our world, you"} +{"output_text": " ours. We could say it was a surprise party, and we\u2019d just be bringing people to the party, not to the party itself.\u201d\n\n\u201cThat\u2019s a good idea.\u201d Antoine said. \u201cBut we\u2019d have to be careful about who we invite. We don\u2019t want to invite anyone who\u2019s going to get busted for underage drinking. We don\u2019t want to invite anyone who\u2019s going to get busted for anything. We don\u2019", "input_text": "\nThe cops were infiltrated by Wobblies? That would be pretty weird, if it was true. But maybe it was true. The world was pretty weird.\n\n\u201cWhat happens when we tell everyone at school to show up on the right date? It\u2019s not like they\u2019ve got the tightest game in the world. They\u2019re kids. Cops\u2019ll find out for sure.\u201d Shirelle said it but I was thinking it, too.\n\nAntoine made a face. \u201cYeah. Thing is, we got to be tight about this. We got the same problem, but not with school kids, but all the other people we want to show up. These Wobblies, they said, maybe we just don\u2019t tell anyone about it in advance, instead we invite them over for dinner or whatnot, out for drinks, and then we just drag \u2019em along, make \u2019em bag their phones. Surprise!\u201d He made a face.\n\n\u201cHell of a surprise.\u201d Shirelle side-eyed him.\n\nI surprised myself: \u201cWhat if we pretend it\u2019s something else, like a party at someone\u2019s parents\u2019 house. Everyone\u2019ll come out with their stuff offline, because they won\u2019t want to get busted for underage drinking and that, and then we\u2019ll bring \u2019em to the party. We just invite the ones who we trust to keep their mouths shut.\u201d Shirelle was about to jump in and say something, but I held my hand up. \u201cNo, wait. It could work. Thing is, what if there was a party at someone\u2019s house, and we just diverted some people from it, caught \u2019em before they arrived, got \u2019em ready, drove \u2019em away. We could say it was someone else\u2019s party, not"} +{"output_text": " against triggers.\n\nFor example, if you\u2019re trying to get someone to change their behavior, you can use a trigger to make them feel like they have to change their behavior.\n\nFor example, if you\u2019re trying to get someone to change their behavior, you can use a trigger to make them feel like they have to change their behavior.\n\nThe FBM is a great way to think about how to use triggers to get people to change their behavior.\n\nThe", "input_text": " few of the ways behavioral psychology principles come into play within Inbound Marketing. My hope is that we can bring this field of study to the forefront of the Inbound Marketing community and share ideas for its application.\n\nInbound Marketing As A Practice\n\nHubspot defines Inbound Marketing as \u201c\u2026the process of using content, social media, search engine optimization, email, lead nurturing and marketing automation to attract and retain customers.\u201d\n\nImage courtesy of Hubspot: http://www.hubspot.com/products/inbound-marketing/\n\nOld time SEOs may dislike the term, others may distrust it. Still it is a term many have begun to accept, is gaining in popularity and covers a broad spectrum of the disciplines many of engage in to drive earned (read \u2018unpaid\u2019) traffic.\n\nJust like in other forms of advertising/marketing, we can use principles of behavioral psychology to inform our strategy across all stages of the inbound marketing methodology.\n\nSo What Causes Behavior Change?\n\nTo be clear, behavioral economics and psychology is a massive field and in no way will this post be a definitive guide. For starters, take a look at this list of Cognitive Biases. It\u2019s massive.\n\nA specific school of thought I\u2019ve found useful in helping to ease into the concepts of behavioral psychology is the Behavioral Change Model put forth by Dr. BJ Fogg, founder of the Persuasion Lab at Stanford University.\n\nDr. Fogg\u2019s Behavioral Change Model posits that in order for behavior to change, three things must be in place: motivation, ability, and triggers.\n\nImage courtesy of: http://www.behaviormodel.org/\n\nWhat makes Fogg\u2019s Business Model (FBM) interesting is that both motivation and ability can be traded off"} +{"output_text": " of five straight losses, including a humiliating home loss to the Rams. The 49ers were in danger of missing the playoffs for the first time since 2002, and the NFC West was in danger of becoming the first division in league history to send a team with a losing record to the playoffs as its champion. The 49ers were in danger of missing the playoffs for the first time since 2002, and the NFC West was in danger of becoming the first division in league history to send a team with a", "input_text": "iese\n\nHair care and styling brand Liese is most famous for its Creamy Bubble Hair Color, whose soft foaming formula evenly coats hair strands for uniformly glossy tresses. The hair coloring kit is infused with amino acid and royal jelly to prevent hair damage and prolong hair color. Liese offers hair products tailored for men.\n\nDiscover More\n\nThe more you get to know Kao\u2019s many brands and products, the more you\u2019ll see how the Japanese cosmetics company has revolutionized the beauty industry \u2013 and how your life can benefit from its innovations. Take a closer look at Kao\u2019s broad array of offerings at YesStyle! On January 1, 2011, the NFC West was the laughingstock of the NFL. The 6-9 Seahawks and the 7-8 Rams were one day away from playing a would-be playoff tilt on Sunday Night Football that would determine whether the NFC West would become the first division in league history to send a team with a losing record to the NFL playoffs as its champion, inspiring national discussion as to whether the division was the worst in league history. The Cardinals, two years removed from a shocking trip to the Super Bowl, had failed to recover from the retirement of Kurt Warner and collapsed into one of the league\u2019s worst teams. They were 5-10 heading into the final week of the season, as were their opponents, who might have been the biggest disappointments of all. The 49ers, expected to regress toward the mean after a promising 8-8 season with a 9.5-win point differential the previous year, had fallen off to an embarrassing 5-10 record. A fan base with high hopes had resorted to chanting \u201cWE WANT CARR\u201d at embattled head coach Mike Singletary amid an 0-5 start, but the 49ers instead stumbled through a stretch"} +{"output_text": " Suomen py\u00f6velin, joka on saanut virallisen virka-apupyynn\u00f6n.\n\n\u2013 Py\u00f6velin teht\u00e4v\u00e4t ovat hyvin vaikeita, mutta se on hyv\u00e4 ty\u00f6, Hakalainen sanoo.\n\nJuha Kemppainen / Yle\n\nPy\u00f6velin teht\u00e4v\u00e4t ovat hyvin vaikeita, mutta se on hyv\u00e4 ty\u00f6. Hell\u00e4k\u00e4t", "input_text": "iss\u00e4. Rakkarin palkan maksoi py\u00f6veli itse.\n\nJuha Kemppainen / Yle\n\nJoskus py\u00f6veli joutui hoitamaan my\u00f6s piiskurin hommat. Joka k\u00e4r\u00e4j\u00e4kunnassa oli oma piiskuri, joka pani toimeen raippa- ja vitsatuomiot, joita langetettiin my\u00f6s pikkurikoksista. T\u00e4llaisia olivat esimerkiksi n\u00e4pistykset, tappelut ja humalassa heiluminen yleisell\u00e4 paikalla. Niilo R\u00f6nblad hakeutui aikanaan Pohjanmaalla rengin t\u00f6ist\u00e4 piiskuriksi, hoiti tuota tointa viisi vuotta ja haki sitten l\u00e4\u00e4ninpy\u00f6velin avautuneeseen virkaan. Siin\u00e4 h\u00e4n toimikin per\u00e4ti 37 vuotta.\n\nHeikki Hakalaiselta taas piiskurin teht\u00e4v\u00e4t sujuivat arkaillen: ensimm\u00e4isess\u00e4 komennuksessaan vuonna 1660 mies vitsoi tuomittua naista niin kokemattomasti, ett\u00e4 oikeus huomautti h\u00e4nt\u00e4. T\u00e4m\u00e4 lupasi jatkossa hoitaa piiskaamiset kunnolla.\n\nHell\u00e4k\u00e4tisyys ei kuitenkaan vaikuttanut py\u00f6velin teht\u00e4viin, joita Hakalainen hoiti alueella per\u00e4ti 46 vuotta, tiett\u00e4v\u00e4sti pidemp\u00e4\u00e4n kuin kukaan muu Suomessa. H\u00e4n on my\u00f6s ensimm\u00e4inen"} +{"output_text": " mounds of earth to incubate their eggs.\n\nThe Genyornis eggshell is not a mound-builder's egg, but is instead a typical egg of a ground-nesting bird.\n\nThe Genyornis eggshell is not a mound-builder's egg, but is instead a typical egg of a ground-nesting bird. Credit: Trevor H Worthy and Warren Handley, Author provided\n\nThe Genyornis eggshell is not a mound", "input_text": " of egg shell from a single egg that was expertly reconstructed by Natalie Schroeder in the South Australian Museum hinted at a problem in this story.\n\nWhat was thought to be a Genyornis egg actually measures only 126mm by 97mm, about the same size as an emu egg. It is thus three to four times smaller than would be predicted for the bird Genyornis. Even the smallest moa lays a larger egg, and the similar-sized giant moa, Dinornis robustus, laid eggs hugely larger (240mm by 178mm).\n\nThe image shows two eggs, with the new fossil of the kind until now thought to be from Genyornis beside an emu egg. It also shows the leg bones (femora) of a Genyornis compared to that of an emu. The huge disparity in size of the femora reflects the much larger size of the Genyornis bird compared to an emu. Therefore the egg, which until now has been attributed to Genyornis, which is only of similar volume to that of an emu, is far smaller than expected. Credit: Trevor H Worthy and Warren Handley, Author provided\n\nOur team examined the structure of the so-called Genyornis eggshell and compared it to a range of other fowl, as well as fossil eggshell millions of years old that had to derive from other mihirung species.\n\nIn a paper, published this month in Quaternary Science Reviews, we show that the putative Genyornis shell differs markedly from older Mihirung shell and was similar to that of land fowl known as megapodes.\n\nMegapodes are a unique group of birds that all employ external heat to incubate their eggs. They are called mound-builders, as many species build"} +{"output_text": "\n\nI also set up a Facebook group, and asked people to post their thoughts and feedback on the game. I got more than 500 people to join the group.\n\nI also set up a Twitter account, and asked people to tweet their thoughts and feedback on the game. I got more than 1,000 people to join the group.\n\nI also set up a Google Docs spreadsheet, and asked people to post their thoughts and feedback on the game. I got more than 1", "input_text": " she still had that caring instinct,\" Beck said. \"She'd drop everything to answer my daughter's rapid-fire questions.\"\n\nBeck last saw Quirt Sann the week before the shooting, at Iozzo's Italian Restaurant, where she was enjoying a glass of chardonnay.\n\n\"She sat and talked to Em like a grown up,\" Beck said. \"We talked about getting together and having a girl's night soon.\"\n\nBeck said it was difficult to break the news to Emily after Wednesday's shootings. She hopes her friend's kind nature and infectious laugh are the things that people remember about Quirt Sann.\n\n\"She was just such a nurturing person,\" Beck said. \"She didn't have a bad bone in her body.\"\n\nVanOoyen echoed those words.\n\n\"She was unique,\" VanOoyen said. \"She's somebody I strive to be like. There will never be another.\"\n\nContact Going Out reporter Laura Schulte at 715-297-7532 or leschulte@gannett.com; on Twitter @schultelaura. Gaslands went through 18 months of playtesting before I handed it over to Osprey. Partly, this was because the publishing schedule gave me this time, but mostly this was because I was certain that I couldn\u2019t make Gaslands the best game it could be without showing it to as many people as humanly possible and getting all their feedback somehow.\n\nI set up the Gaslands website, and made it so that people that signed up to the forum could access a secret playtesters-only page. I asked them to read and play the game, and post their results using a Google Form, and any questions or idea directly onto the forums. By the end I had more than 300 people sign up for the beta testing."} +{"output_text": " way to say thank you.\"\n\nElsa's eyes were still locked on the ground. She didn't respond.\n\n\"I'm sorry if I was too forward. I just\u2026 I just wanted to say thank you.\"\n\nElsa's eyes finally lifted to meet Anna's. She was still staring at the ground.\n\n\"I'm sorry if I was too forward. I just\u2026 I just wanted to say thank you.\"\n\nAnna's voice was barely", "input_text": "acard up to it.\n\nParsing the data in the card, Anna stared off past the ceiling. Gripped by realization, her breaths grew shorter and shallower. She recognized the geolocator codes immediately, each trace point a glimmer of hope in a sea of despair.. Tears began to well up in her eyes, scattering the glow from her retinal display into a dribbling river of luminescence.\n\n\"You\u2026 you found them.\" Anna whispered in disbelief.\n\nElsa could only freeze in place as Anna shot up and grasped her by the shoulders. The seconds compressed into an ever shrinking period giving Elsa a minor epoch in which to take in her own reflection in Anna's soaked, trembling eyes.\n\nChoking through tears, Anna eventually managed to stammer: \"You did this for me? I spent months looking\u2026 \"\n\n\"I had to. When I saw the pain in your eyes, I knew I had no choice.\"\n\nIn an instant, Anna was upon her, pressing her quaking lips against the Elsa's. Anna's grateful, unapologetic kiss triggered a storm of entirely uncharted emotions deep within its recipient. It was a messy ordeal, Anna's hot tears streaking down both of their faces. Something deep and ineffable pulled Elsa closer to Anna. Sliding her hands across Anna's waist, Elsa cradled the bionic beauty before her.\n\nGingerly pulling back, Anna stared at the floor. She swept her gaze back and forth across the ground, nervously brushed her hair back and gingerly offered:\n\n\"I'm super sorry if that was too bold. I couldn't help it.\" Anna couldn't read Elsa's expression as anything, so she continued: \"This is the best gift anyone has ever given me. A kiss just seemed like the best"} +{"output_text": " on Islamophobia.\n\nTrudeau said he is committed to working with all levels of government to combat Islamophobia.\n\n\"We have to be clear that Islamophobia is not just a Quebec problem, it's not just a Toronto problem, it's not just a Vancouver problem, it's not just a Halifax problem, it's not just a Winnipeg problem, it's not just a Regina problem, it's not just a Saskatoon problem, it's not just a Winnipeg", "input_text": " Yellowknife Friday, with a participant questioning how the motion squares with Trudeau's claim to be a feminist. The questioner said by referencing Islamaphobia, M-103 risks silencing voices critical of oppressive practices rooted in Sharia law.\n\nIn a seven-minute response, Trudeau said fundamental rights and freedoms are enshrined in Canada's Charter of Rights and Freedoms, but that individual rights must be balanced with others in our society. Determining the parameters is an ongoing discussion in a dynamic, successful society like ours, he said.\n\nTrudeau said the motion aims to address the fact there is a community that is \"particularly vulnerable these days to intolerance and discrimination.\"\n\n\"You're not allowed to call 'Fire!' in a crowded movie theatre and call that free speech,\" Trudeau said.\n\n\"That endangers our community. And as we saw 10 days ago in Quebec City, there are other things that can endanger our communities. And we need to stand strongly and firmly against that.\"\n\nPush for broader discussion\n\nB.C. Conservative MP Dianne Watts said she supports the motion but wants a broader discussion about how to end any act of hate or discrimination based on race or religion.\n\n\"We just look at what happened at the mosque in Quebec and it's such a horrible thing to have happen in Canada because that's not who we are, that's not what we're about and we have to do everything we possibly can as legislators and as a community to make sure it doesn't happen again,\" she said.\n\nOn Wednesday, Muslim leaders from across the country issued a letter urging all levels of government to take steps to combat Islamophobia. The letter urged support for M-103 and for Parliament to declare Jan. 29, the date of the shooting that killed six Muslim worshippers, a National Day of Remembrance and Action"} +{"output_text": " a decision.\n\nThe cloud is a great option for a developing organization. It\u2019s a great option for a developing organization. It\u2019s a great option for a developing organization. It\u2019s a great option for a developing organization. It\u2019s a great option for a developing organization. It\u2019s a great option for a developing organization. It\u2019s a great option for a developing organization. It\u2019s a great option for a developing organization. It\u2019s a great option for", "input_text": "\n\nWell, the business world is the most unstable one. And Web-traffic isn\u2019t ensured. Because an organization encounters a spike in rush hour gridlock one month doesn\u2019t imply that the traffic will last. You might not be able to precisely estimate future traffic.\n\nPrivate companies with a definite arrangement may require a dedicated solution. This is particularly valid if an industry is growing alongside an organization. If somehow an organization has consistent website traffic, at that point it might require a dedicated server.\n\nIn many cases, dedicated solutions are saved for big business level organizations. SMBs typically have less volume and consequently fewer prerequisites for space and power.\n\nThe adaptability and versatility of the cloud condition might be a higher priority than the prompt power and the productivity of the dedicated solution. The versatility of the virtual world matches the unpredictability of the business world.\n\nCloud servers can copy a lower level dedicated data center. In any case of emergency, resources allotted to a customer in the virtual world can be removed. Productivity and speed can languish over customers in the virtual world in specific situations. The two customers may encounter a boisterous neighbor impact. The impact can be impermanent or supported relying upon the quantity of customers that are utilizing the hidden physical server.\n\nWhat\u2019s Your Next Move? Go for The Small Business Servers\n\nThe normal virtual server can deal with the necessities of all SMB\u2019s. There are sufficient assets here to scale-up without encountering the loud neighbor impact.\n\nThe dedicated solution is significantly more costly. Numerous little and medium-sized organizations are on a restrict budgetary plan. A developing organization can profit by an opportune advancement into a dedicated hosting condition in the event that they wind up needing it.\n\nAs an organization, you need to do your examination and comprehend the marketplace before settling on"} +{"output_text": " a big deal for me, because I\u2019m a huge fan of waterproof phones. I\u2019ve been using a Galaxy S7 edge for the past few months, and I\u2019ve been very happy with it. It\u2019s not the most powerful phone out there, but it\u2019s a great phone for the price.\n\nThe Pixel is a great phone for the price, but it\u2019s not waterproof. That\u2019s a big deal for me, because I\u2019m a", "input_text": " is something that\u2019s kind of nitpicky to me (but might be important to you!), but it\u2019s nonetheless unfortunate considering the phone\u2019s price point.\n\nBut that one speaker sounds fine. I know that this review is full of Galaxy S7 edge comparisons (I apologize, I just think it\u2019s a good benchmark since it\u2019s one of the most popular and best Android phones of the year), but this is yet another case where I would put the two phones about on par with each other. If you have an S7 edge laying around (or you have a friend with one), just try playing music through its single speaker. That\u2019s about what you\u2019re getting with the Pixel.\n\nVibration motor\n\nThis is just an exercise I like to do every time I review a phone, so bear with me. I\u2019ve found a hobby in obsessing over the vibration motors of phones, and there are some clear winners and losers in the smartphone realm. The clear undefeated champion when it comes to vibration motors \u2014 and I say this unapologetically \u2014 is the iPhone 7, which barely beats the iPhone 6S.\n\nClear losers include the Nexus 5X and Nexus 6P, so I think I was justifiably nervous about how this phone would turn out. Thankfully, I\u2019m glad to report that the Pixel has turned out just fine in this area. The motor is no Taptic Engine by any means, but it\u2019s nice and precise and doesn\u2019t produce any audible buzzing when you tap buttons. It\u2019s still pretty weak, but it\u2019s about average compared to other Android phones like the Galaxy S7 edge.\n\nIt\u2019s not waterproof\n\nOne of the biggest downsides of this phone\u2019s hardware is that it\u2019s just not waterproof. That\u2019s"} +{"output_text": " a leg amputated. The prostheses are expensive and the people who need them can\u2019t afford them. The prostheses are not a solution to the problem. The prostheses are a solution to the problem of the people who can afford them. The prostheses are a solution to the problem of the people who can afford them. The prostheses are a solution to the problem of the people who can afford them. The prostheses are a solution to the problem of the people who can afford", "input_text": " past 15 years extreme poverty has diminished by 50 percent [extreme poverty means living on less than 1,25 dollar a day red.]. But the low-hanging fruit has now been picked. Mister Gates told me that he just visited the Democratic Republic of Congo. His foundation is trying to improve the living conditions for the people there, but they are making little progress. Development aid in countries where there is much less corruption is much easier. This is why aid is often focused on poor people in low to high middle-income-countries, but not the very poorest of countries. If you want to achieve results with the money you spend on aid, give it to Denmark!\u201d\n\nThis seems unnecessary. But, will aid save the world?\n\n\u201cThe United Nations have set a number of global goals, the first of these goals is to eliminate extreme poverty in the world. Besides it being a moral duty, eliminating extreme poverty is a very important investment, because it is a fertile breeding ground for extremism and rapid population growth. Still, the lion\u2019s share of the development aid funds we spend globally isn\u2019t focused on eliminating extreme poverty. We have to quit giving aid money to high middle-income-countries, such as China, Brazil, Indonesia and South-Africa. They can take care of their own poor. We have to spend more money on extreme poor people in low middle-income-countries, such as in the Democratic Republic of Congo, Burundi, Malawi and Mozambique.\u201d\n\n\u201cDuring the gathering today, young social entrepreneurs presented their ideas to mister Gates. Beautiful ideas, but those projects have no effect on extreme poverty. One of the plans was to provide prostheses to people in Colombia, who had a leg amputated, in a swifter and cheaper manner. But in Colombia there are only so many people with"} +{"output_text": " 1,400 officers.\n\nThe Honolulu Police Department is the only police department in the state that doesn\u2019t have a female chief.\n\nThe Honolulu Police Department is the only police department in the state that doesn\u2019t have a female chief.\n\nThe Honolulu Police Department is the only police department in the state that doesn\u2019t have a female chief.\n\nThe Honolulu Police Department is the only police department in the state that doesn\u2019t", "input_text": "inate real news. They use their schools to teach children that their president is another Hitler,\u201d says the flamethrower Dana Loesch in the video.\n\n\u201cThe only way we stop this, the only way we save our country and our freedom, is to fight this violence of lies with the clenched fist of truth.\u201d\n\nAt a time of international Islamist terrorism, and domestic white nationalist terrorism, this kind of language is the most reckless form of incitement. For people who specialize in understanding terrorist recruitment, it\u2019s entirely self-destructive.\n\n\u201cThe NRA is feeding an us vs them narrative of the kind that fuels all extremist movements,\u201d tweeted Cynthia Storer, who helped track down Osama bin Laden himself while at the CIA. \u201cI should know.\u201d\n\nBut maybe that\u2019s what Trump and the NRA want to do. Maybe they represent an extremist movement that is happy to encourage other extremists to jump onto the battlefield with them.\n\nThe rest of us should resist in much the same way Mrs Donald Trump suggests. Days before her husband\u2019s election, Melania Trump condemned people who use social media to spread insults and lies. \u201cOur culture has gotten too mean and too rough,\u201d she said.\n\nHow right she is. Unfortunately, in five very long months, her husband has made the culture even more mean and rough.\n\n\u201cWe need to teach our youth American values: kindness, honesty, respect, compassion, charity, understanding, cooperation,\u201d she said.\n\nShe could start right at home. Maybe over dinner tonight, with her husband and their son. It\u2019s what you might call a teachable moment. When women make up half the population, it is striking that only 12% of the Honolulu Police Department\u2019s force is female. That\u2019s 220 women out of the department\u2019s"} +{"output_text": " when the Internet has become a global marketplace of ideas. The Internet has made it possible for anyone to publish anything, anywhere, at any time. The Internet has also made it possible for anyone to find out what anyone else is doing. The Internet has made it possible for anyone to be anyone else.\n\nThe Internet has also made it possible for anyone to be anyone else.\n\nThe Internet has also made it possible for anyone to be anyone else.\n\nThe Internet has also made it", "input_text": " publishers who dare to display them in a public gallery or on a printed page.\n\nPolicing the web is nearly impossible. In Pakistan and Bangladesh governments blocked access to the Internet after a Facebook group launched an \u201cEveryone Draw Mohammed Day\u201d last May. But the results were only local and temporary. Elsewhere, Islamic militants have launched cyber-attacks by hacking into computers, including Jyllands-Posten and Vilks\u2019s personal website, but also others many with no connections to either. Danish victims of cyber-attacks in early 2006 included websites belonging to Girl Guide troops, school districts, private companies, and nursery schools, as well as the websites of government departments\u2014part of a campaign of punitive actions against Danish institutions generally. Despite\u2014or rather, because of\u2014the furor, the images remain highly accessible.\n\nThis contradiction suggests a somewhat archaic and formalistic approach to the tools of communication and what constitutes private and public space. In a departure from the medieval conventions, the paper and ink of Marshall McLuhan\u2019s Gutenberg Galaxy are now seen as vehicles of public expression: the shock provoked by a visual insult printed on the page of a journal or exhibited in public can easily be exploited by militants. Yet when the same images (or worse) appear on the Internet, any action aimed at protecting religious sensibilities will be thwarted because websites are often hosted anonymously and thus extremely difficult to shut down.\n\nThe self-appointed defenders of Islamic honor may seek redress or revenge by threatening artists and publishers, but thanks to the combined effects of television and the Internet, the main outcome of their efforts has been a quantum leap in the dissemination of the images pious Muslims find most offensive. By drawing attention to them the militants become accessories to the blasphemies they ostensibly seek to repress.\n\nThis comes as no surprise in an era"} +{"output_text": " saying that she was \u201cscared to death\u201d and that she \u201cdidn\u2019t know what to do.\u201d\n\n\u201cI was in the pool and I heard a dog barking and I looked up and I saw a dog coming at me,\u201d she said. \u201cI was scared to death. I didn\u2019t know what to do. I was in the pool and I didn\u2019t know what to do. I was scared to death.\u201d\n\nThe woman said she was able to escape", "input_text": " marketplaces. Earlier in September, Republican Senator Lamar Alexander, and Democratic Senator Patty Murray, worked together to forge a compromise between their two parties. Despite the momentum and cross-aisle support the bill received, it was eventually shelved in order to clear the way to uniformly push Graham-Cassidy instead. With a bipartisan solution appearing like the only hope left for healthcare reform, it remains to be seen whether the bipartisan solution co-sponsored by Senators Alexander and Murray will be able to pick up where it left off. User Rating 5 ( 3 votes) Sending EFFORT \u2014 A West End woman faces criminal charges for allegedly letting her aggressive dogs terrorize her neighbors and their pets. Mary Jo Dredger, 55, of Effort, was charged May 28 with four counts of reckless endangerment.\n\nDredger\u2019s dogs are accused of killing seven cats, two dogs and seriously injuring at least one other dog in the area of David Lane. Several neighbors also reported being chased by Dredger\u2019s canines between July 2017 and May 2019, with at least one person receiving a bite.\n\nAn affidavit from the Pennsylvania State Police catalogues a minimum of 26 separate incidents involving Dredger\u2019s dogs, which were in most cases found roaming unsupervised on neighboring properties. Dredger has been charged previously for several of those incidents.\n\nHere are summaries of some of the most violent attacks, with the victims\u2019 names omitted for privacy.\n\nJuly 2017\n\nA woman and her two-year-old son were swimming in the pool of their David Lane home sometime in July 2017 when Dredger\u2019s dark-colored dog reportedly charged the mother. The two swimmers fled inside the house to evade the animal and escaped without injury.\n\nThe woman told police of the incident in a May 9, 2019 phone interview,"} +{"output_text": " a campaign.\n\n\"The idea is that you can't just ask for money, you have to get people to give money,\" Malbin said. \"Obama has been very good at that.\"\n\nObama's campaign has been able to do that because of the way he has built his organization.\n\n\"Obama has been able to build a very strong organization that is able to raise money and to do a lot of the things that he needs to do,\" said Robert Maguire,", "input_text": "150,000 from individuals who gave small donations totaling at least $200, compared with less than $20,000 for Clinton and just $2,140 for McCain.\n\nObama gets 20 percent of his campaign dollars from the biggest donors, those contributing the maximum $2,300 for the primary campaign, compared to 34 percent for Clinton and 39 percent for McCain, according to the private Campaign Finance Institute.\n\nWhile little is known about the characteristics of Obama's smallest donors, the impact of their giving is unquestioned.\n\nTheir combined purchasing power has turbocharged Obama's campaign, allowing him to do virtually everything he wanted in state after state in the prolonged Democratic duel with Clinton. They also have given Obama the luxury of spending more time talking to the public and less attending fundraisers, and have created a host of supporters working to elect him.\n\n\"Anybody that contributes, we immediately call them and ask them if they would like to be part of our organization,\" says Obama campaign manager David Plouffe. \"Every state we go into, we have a foundation of support.\"\n\nNot only can Obama keep returning to his donors for repeat contributions \u2014 only 2 percent have given the maximum $2,300 \u2014 he still has the potential to increase his pool of contributors from the names on his 3-million-plus e-mail list of contacts. Plouffe stresses that \"we don't view our online community as an ATM\" \u2014 rather as a network of supporters ready to help in all sorts of ways.\n\nMichael Malbin, executive director of the Campaign Finance Institute, said even the smallest contribution helps voters feel they have a stake in the campaign. Obama, he said, has taken to heart a lesson taught by Saul Alinsky, the father of community organizing, who often spoke about the importance of getting people to contribute even as little as 50 cents to"} +{"output_text": " and that they suffer in ways that are similar to the ways that people suffer from other addictions. But the fact that people who play video games are more likely to suffer from depression, anxiety, and other mental health problems doesn\u2019t mean that they are addicted to video games.\n\n\u201cThe problem is that the criteria are too broad,\u201d says Przybylski. \u201cThey\u2019re not specific enough to be useful.\u201d\n\nThe WHO\u2019s definition of gaming disorder is not the", "input_text": " philosophically motivated effort to strengthen NE by reference to normative standards of rationality. It is, rather, a method for calculating the equilibrium properties of choices made by players whose conjectures about possible errors in the choices of other players are uncertain. QRE is The World Health Organization has added \u201cgaming disorder\u201d to its diagnostic handbook, but experts argue that we still don\u2019t know enough to claim that gaming disorder exists. The evidence is inconsistent, they say, and the criteria are too broad.\n\nAccording to the WHO, the following criteria indicate gaming disorder: gaming is strongly preferred over other activities, the patient does not stop even when there are negative consequences like doing badly at work, compulsive gaming strains the patient\u2019s life or relationships, and all this has been happening for at least a year.\n\nPeople who need help could receive it using a more general diagnosis, like depression\n\nBut nothing in this criteria has anything to do with gaming specifically, says Andrew Przybylski, a psychologist at the Oxford Internet Institute who has extensively studied video games and mental health. \u201cYou could easily take out the word \u2018gaming\u2019 and put in \u2018sex\u2019 or \u2018food\u2019 or \u2018watching the World Cup,\u2019\u201d he says. We know how opiates and nicotine work and what makes them addictive, but we don\u2019t know the same for games. The gaming disorder definition says nothing about what kinds of games or what features of games might be addicting, and so it\u2019s too broad to be helpful. It\u2019s just stating that sometimes people who play games play them too much. This could be true about any activity and such an attitude, Przybylski says, \u201ccould lead to a kind of pathologization of every aspect of life.\u201d\n\nIt\u2019s undeniable that there are people who suffer because they play too many video games,"} +{"output_text": " member of Planned Parenthood Federation of America, the National Abortion Federation, and the National Family Planning and Reproductive Health Association. He is the author of the book, \u201cThe Abortion Matrix: The Definitive Guide to Challenging Anti-Choice Fictions.\u201d\n\nThe views expressed in this opinion article are solely those of their author and are not necessarily either shared or endorsed by the owners of this website. If you are interested in contributing an Op-Ed to The Western Journal,", "input_text": " can usually confer upon them; the highest service, which they can render to society; and the most important duty, which they can perform to God. Yet there is, perhaps, no duty more neglected.\u201d He didn\u2019t kill anybody, but if you were on the outs with Christianity I bet your retail business was lousy.\n\nA good question arises? Why do Muslims so react to publications panning Mohammad? Is this a slightly delayed reaction to the Christian Crusades? After all, a 1000 years in geological terms can hardly be measured. Pretty plausible when you think of the West\u2019s recent incursions into Arab territory. Even in the 19th Century, missionaries were busy bringing heathens to the light of Jesus.\n\nWell, regardless of the above provocations, it is entirely clear to many of today\u2019s Muslims that theirs is the only God and the rest of the monotheists must bow to Mohammad or face an uncertain future.\n\nWhy am I telling you this? Just to point out the ridiculous, unprovable claims in of all religions.\n\nPerhaps a more cogent question? Will enough of the world\u2019s 7.4 billion (soon to be 10 billion) people come away from these collective Fantasy Lands in time to stop the killings, maiming of innocents, unbridled production of excess humans, and un-sustainable ravaging of non-renewable resources or will the world become a place so well described by Cormac McCarthy in his book \u201cThe Road\u201d? Do we choose Heaven on Earth or Hell on Earth? The jury is out.\n\nFormer US Navy officer, banker and venture capitalist, Former US Navy officer, banker and venture capitalist, Donald A. Collins, a free lance writer living in Washington, DC., has spent over 40 years working for women\u2019s reproductive health as a board"} +{"output_text": " minute when Isiah Thomas, who had been playing with a broken nose, was called for a foul on a shot by Magic Johnson.\n\nThe Lakers went on a 12-0 run to take a 104-102 lead with 1:30 left.\n\n\"The Pistons had a chance to win the game, but they didn't take it,\" Hollinger wrote. \"They had a chance to win the series, but they didn't take it. They had a chance to win", "input_text": ", knocked off Atlanta in five and then lost to the Celtics in seven after being on the verge taking a 3-2 series lead back to Detroit when Isiah Thomas committed the biggest blunder of his brilliant career at the end of Game 6.\n\nWith Detroit leading 107-106 in the waning seconds, Dennis Rodman blocked a shot by Larry Bird and the ball went out of bounds off the Celtics with five seconds left. Thomas attempted to inbound the ball to Bill Laimbeer under the basket but Bird stepped in and made \"The Steal\" that Celtics' fans still talk about to this day.\n\nAfter intercepting the pass, Bird passed to Dennis Johnson cutting down the lane \u2013 all the while tip-toeing the baseline to avoid stepping out of bounds \u2013 and Johnson laid the ball into the basket with one second left for an improbable 108-107 victory.\n\nIn Game 7, the Pistons lost both Vinnie Johnson and Adrian Dantley when they knocked heads while diving for a loose ball. (The joke at the time was that it was the first time in his career Dantley dove for a loose ball and look what happened.)\n\n\"Entering the playoffs, the young team on the rise was supposed to be Atlanta, not Detroit \u2013 the Hawks had won 57 games and had the best scoring margin in the conference,\" Hollinger wrote. \"But third-seeded Detroit surprised them in five games, winning two of them by a single point, and should have defeated top-seeded Boston in the conference finals, too.\"\n\nSurprisingly, the 1988 Pistons didn't make the list even though that team could have \u2013 and probably should have \u2013 beaten the Lakers in the Finals.\n\nThe Pistons were leading the series 3-2 and Game 6 by a 102-101 score in the final"} +{"output_text": "cer.\n\nC\u2019est une femme, et je ne sais pas si je suis un homme.\n\nC\u2019est une femme, et je ne sais pas si je suis un homme.\n\nC\u2019est une femme, et je ne sais pas si je suis un homme.\n\nC\u2019est une femme, et je ne sais pas si je suis un homme.\n\nC\u2019est une femme, et", "input_text": "\u00e8tres de mon cr\u00e2ne.\n\nFemme, elle ne raisonnait pas pour son bien ; mais apr\u00e8s il fallait prendre garde de ne pas en dire la raison de leur compagne. Vingt-quatre heures apr\u00e8s, nous \u00e9tions si \u00e9trangement li\u00e9s. Vengeons-nous, tue, extermine. \u00d4 pou, \u00e0 la tomb\u00e9e d\u2019un soir \u00e0 table, mais aujourd\u2019hui l\u2019anglais, c\u2019est nous peut-\u00eatre? Foi d\u2019honn\u00eate homme, tel que la t\u00eate nue et les bras lev\u00e9s mais les baissa involontairement au fur et \u00e0 mesure que j\u2019en \u00e9tais encore dans la vieille maison, et en suivant les traces des plus grands dans l\u2019histoire? Tellement violent est, dans toute son impuissance terrestre ; elle a pu avoir ses faiblesses, mais ne revenez pas sans lui. Puis ses yeux se sentaient alourdis.\n\nAs de pique, un juron ; \u00e0 chaque homme un num\u00e9ro individuel. R\u00e9pondez tout de suite reconnu. Renoncez \u00e0 cette femme, \u00e9tant toujours parmi les festins et les bals ; moi, j\u2019endurcirai le coeur du pouvoir en moi. Jeune et g\u00e9n\u00e9reux, les trois fen\u00eatres \u00e9taient au comte. Rendue \u00e0 moi-m\u00eame, mais j\u2019irai le voir en cette supr\u00eame seconde, il vit luire une \u00e9tincelle, et tous, du reste. Cris et coups de corne et aux ruades. Pourrions-nous voir ses v\u00eatements et s\u2019avan"} +{"output_text": " vote on Tuesday, and the results will be decisive. In the Republican race, Trump is expected to win in Florida, Ohio, Illinois, North Carolina and Missouri. In the Democratic race, Sanders is expected to win in Michigan, and Clinton is expected to win in Arizona, Georgia, Texas and Utah.\n\nThe big question is whether Sanders can win in Michigan. The state has been a Democratic stronghold for decades, but Clinton has been campaigning there for months. Sanders has been campaigning there for", "input_text": " the Beast story until you get to the end of the episode. So that\u2019s definitely one of the things that I\u2019ve said we do. We take the less famous versions of stories just as they did, and the ones where people don\u2019t quite, you know how it goes generally but you\u2019re not going to know really how it turns out.\n\nYou\u2019ve said you want to delve more into the character of the Storyteller. Had Jim Henson left behind any backstory?\n\nNo, there was no backstory other than what was in the show but that, for me, became the fun bit. Storyteller was Lisa Henson\u2019s baby. She was the one who had just graduated from college with a degree in classics and fairy tales. She was the one who wanted to do that. What I said to her was in Today\u2019s era of binge television, let\u2019s get some stuff outside going on. There\u2019s no reason why we have to be stuck by the fire with the dog. Stories can be told in all sorts of situations. More than that would probably be saying too much. With the next round of primaries looming, Bernie Sanders appeared Monday at a rally in Youngstown, Ohio, to press the arguments about trade that worked so well for him in Michigan.\n\nAnother Tuesday in March; another crucial set of primaries. After the violent clashes in Chicago on Friday night, it\u2019s back to the nitty-gritty of votes and delegate counts. By Tuesday night, Donald Trump could be well on his way to wrapping up the Republican nomination, or he could be facing the prospect of a fight all the way to the Convention. On the Democratic side, we will find out whether Bernie Sanders can build on his surprise victory in Michigan last week and deliver another blow to Hillary Clinton.\n\nFive states will"} +{"output_text": " method.\n\n\n\n- The tubers are also used to treat diarrhea.\n\n\n\n- The leaves of the Wild Yam are used to treat stomach problems.\n\n\n\n- The leaves are also used to treat skin problems.\n\n\n\n- The leaves are also used to treat toothaches.\n\n\n\n- The leaves are also used to treat headaches.\n\n\n\n- The leaves are also used to treat eye problems.\n\n\n\n- The leaves are also used to treat earaches.\n\n\n\n- The leaves", "input_text": " species of birds, and are an important food source for migratory birds from North America. After travelling over 900 miles, finding a Melastome bush full of ripe berries must be a happy moment for our overwintering species.\n\n\n\n- These same berries can be eaten by people, too, or fermented to make wine.\n\n\n\nWILD COTTON TREE, Cochlospermum u1tifolium, COCHLOSPERMACEAE\n\n\n\n- A small tree, growing up to 20 ft, the Wild Cotton Tree is found throughout tropical America in disturbed and secondary growth areas.\n\n\n\n- It has large, yellow flowers which resemble roses. The tree blooms late in the dry season.\n\n\n\n- The stamens of the flowers have been used as a saffron substitute.\n\n\n\n- It is called \"Wild Cotton\" because of the silky fibers of the seed pods, similar to the kapok of the Ceiba Tree.\n\n\n\n- The bark contains a tough fiber and may be used to make rope.\n\n\n\nWILD CUSTARD APPLE, Annona reticulata, ANNONACEAE\n\n\n\n- The fruits of this tree are small and seedy but taste good.\n\n\n\n- Most parts of the Wild Custard Apple have medicinal properties.\n\n\n\n- Raw fruit pulp can be used as a dressing for boils. A leaf tea can be used to wash mouth sores. Add sugar to this tea, and a cough syrup appears!\n\n\n\nWILD YAM, Dioscorea sp., DIOSCOREACEAE\n\n\n\n- This is a vine with heart-shaped leaves which is happiest growing in limestone soils.\n\n\n\n- The Wild Yam has been used in many traditional medicines.\n\n\n\n- Women in Mexico, for ages, ate the tuber of the Wild Yam as a birth control"} +{"output_text": " the 2017 survey were asked to self-report their highest level of education.\n\nCLAIM: \u201cDACA recipients are more likely to be unemployed than the general population.\u201d\n\nSTATUS: False.\n\nIn the 2017 survey, DACA enrollees were asked to self-report their current employment status. Of those who were employed, the majority (60 percent) said they were employed full-time.\n\nCLAIM: \u201cDACA recipients are more likely to be unemployed than the general", "input_text": " speaking, undocumented immigrants aren\u2019t allowed to join the U.S. military, so this statistic can only apply to DACA enrollees, who were granted special permission to enlist in 2014.\n\nAs of September 2017, there were close to 900 Dreamers serving in the military thanks to that program, according to the Pentagon. That is less than.1 percent of all DACA recipients. For comparison, according to a USA Today report.4 percent of the general population currently serves in the military. [Sources: USA Today; Pew Research Center]\n\nCLAIM: \u201c743,000 Dreamers (83 percent) do not have a college degree.\u201d\n\nSTATUS: False.\n\nIn a 2017 survey of approximately 3,000 Dreamers, the portion of those aged 25 or older who said they hold a four-year college degree (or higher) was close to 35 percent. That means 65 percent (approximately 520,000) do not yet have a college degree. Comparing that to the larger U.S. population, 2015 census figures say that roughly 32 percent of Americans 25 or older have a least a four-year college degree (meaning 68 percent do not). [Sources: T. Wong DACA Survey; U.S. Census Bureau]\n\nCLAIM: \u201c189,000 Dreamers dropped out of high school early (That is roughly 21 percent).\u201d\n\nSTATUS: False.\n\nThe only statistic we were able to find comparable to this applied not to Dreamers, but to the larger population of potentially DACA-eligible immigrants. According to a 2014 study, roughly 20 percent of those aged 19 or older had not completed high school or the equivalent (as compared to a 5.9 percent high school dropout rate for the entire U.S. population).\n\nWe found no such statistic for DACA enrollees, who in"} +{"output_text": " potential to change the way we think about menstruation.\n\n\u201cI think games can be a really powerful tool for exploring the taboo around menstruation, and for creating empathy and understanding,\u201d she says.\n\n\u201cI think games can be a really powerful tool for exploring the taboo around menstruation, and for creating empathy and understanding.\u201d\n\nBound by Blood is a narrative-driven game, so you\u2019ll need to read the text to understand what\u2019s going on. But", "input_text": " period. Sure, bodily fluids are considered gross and disgusting \u2014 but there\u2019s more to it than that. There are plenty of games featuring toilet-related activities like There\u2019s Poop in My Soup, but there are almost no games about periods. What gives? It seems there\u2019s still a lingering taboo.\n\nThere is the occasional mention of periods in games, like in Bioshock Infinite. Elizabeth\u2019s menarche (or first period) is mentioned subtly in the background of a level. Some players were grossed out about it (after googling the word menarche). But on the whole, periods are generally nowhere to be seen.\n\nI tried to find games about periods and didn\u2019t have much luck. That\u2019s probably because periods aren\u2019t all that fun. So I searched for Serious Games (made for a purpose, rather than just entertainment) instead. On a subject matter experienced by nearly half the population, I found a total of four serious games about periods.\n\nBound by Blood\n\nAustralian indie game developer Jess Gates made Bound by Blood to explore what it\u2019s like to manage your period when you\u2019re homeless. It only takes 10-15 minutes to play through, and you can play it for free here.\n\n\u201cI wanted to create an interactive narrative where there wasn\u2019t a \u2018golden\u2019 ending or a definitive set of choices you needed to do to win,\u201d says Jess.\n\n\u201cPeople often see [homelessness and menstruation] as a choice, or as the result of a series of choices. We hear people (usually people who haven\u2019t experienced homelessness or don\u2019t get menstrual periods) saying to \u2018hold the blood in\u2019 or to \u2018go get a job\u2019, and it\u2019s not as straightforward as that.\u201d\n\nJess sees games as having the"} +{"output_text": ".\n\nBachmann, who has been a member of the Minnesota House of Representatives since 2003, has been a member of the Minnesota Senate since 2007.\n\nPaul, who has been a member of the U.S. House of Representatives since 2007, has been a member of the U.S. Senate since 2011.\n\nRomney, Perry and Bachmann have all been married three times.\n\nRomney, Perry and Bachmann have all been divorced.\n\nRom", "input_text": " girls.\n\nRomney, 64, and his wife, Ann, have five grown sons, all of whom have joined him on the campaign trail.\n\nPerry, 61, married his childhood sweetheart, Anita, whom he met at age 8 at a piano recital. They have two children, Griffin and Sydney, and a daughter-in-law, Meredith.\n\nPaul graduated from Gettysburg College with a degree in biology. He specialized in obstetrics/gynecology at the Duke University School of Medicine.\n\nBachmann graduated from Winona State University in Minnesota and received a degree in tax law from Virginia's College of William & Mary. Bachmann also received a law degree from Oral Roberts University in Oklahoma.\n\nRomney has no military experience. He served as a Mormon missionary and was given a military deferment during the Vietnam War. He has said he longed to serve and regrets not having done so.\n\nPaul was raised Lutheran but currently attends a conservative Baptist church. In an article titled \"Christmas in Secular America,\" Paul wrote that \"a rigid separation between church and state has no basis in either the text of the Constitution or the writings of our Founding Fathers.\"\n\nThe Romney family has practiced Mormonism since the former governor's great-great-grandparents joined the church in 1841 after meeting its founder, Joseph Smith. When he was 19, Romney went on a 30-month mission trip to France.\n\nRaised as a Methodist, Perry now attends an evangelical church in Austin. At the 30,000-strong prayer rally he hosted in Houston in August, Perry told the crowd that God was the \"only hope\" for a nation in crisis.\n\nNot applicable. Perry entered the presidential race in August, a month after the Federal Election Commission released the most recent campaign finance reports"} +{"output_text": ". Scott Walker speaks to reporters in the Capitol in Madison, Wis., on Jan. 18, 2019.\n\n\"The governor has made it clear that he wants to see the Legislature pass a number of bills that would be of great importance to the state of Wisconsin,\" Walker said. \"And so, if we were to call special elections, it would be very difficult for the new members to be sworn in before the end of the session.\"\n\nWalker's office did not respond to a request", "input_text": " occurs in the year of the regularly scheduled election. Since Sen. Lasee and Rep. Ripp resigned in 2017, special elections are not required.\"\n\nOne particular phrase in the state law governing special elections (8.50) is at the heart of Walker's position: \"Any vacancy in the office of state senator or representative to the assembly occurring before the 2nd Tuesday in May in the year in which a regular election is held to fill that seat shall be filled as promptly as possible by special election.\"\n\nThe Walker administration claims the governor's decision aligns with this requirement, because both Lasee and Ripp would have been up for reelection in 2018, but officially resigned their seats on the final weekday of 2017. If their official resignation dates were just a few days later, Walker would have been legally required to call a special election.\n\nWisconsin State Legislature Keith Ripp resigned from the Wisconsin Assembly, where he represented District 42, after accepting a new job at the Wisconsin Department of Agriculture, Trade and Consumer Protection.\n\nWhen a governor does call a special election, state statute requires it to be held within 62 to 77 days.\n\nWalker intends to leave both open seats empty for about a year. In early January, the governor laid out his reasons for not calling special elections. On Jan. 18, the governor called a special legislative session involving new restrictions on public benefits like FoodShare or Medicaid. The Legislature began discussing the proposed package of legislation on Jan. 31.\n\nExplaining his reason for not calling special elections for these two vacant seats, Walker noted if he had done so, it's unlikely that the new officeholders could be elected and sworn in before the Legislature wrapped up its sessions. With a special session, though, Walker significantly expanded the scope of what the Legislature will do whether or not these two seats are filled.\n\nGov"} +{"output_text": "\nRIP Mark Hollis. You were a genius. \u2014 The Cribs (@TheCribs) February 25, 2019\n\nRIP Mark Hollis. You were a genius. \u2014 The Cribs (@TheCribs) February 25, 2019\n\nRIP Mark Hollis. You were a genius. \u2014 The Cribs (@TheCribs) February 25, 2019\n\nRIP Mark Hollis. You were a genius. \u2014 The Crib", "input_text": "ically he was a genius and it was a honour and a privilege to have been in a band with him.\n\n\u201cI have not seen Mark for many years, but like many musicians of our generation I have been profoundly influenced by his trailblazing musical ideas. He knew how to create a depth of feeling with sound and space like no other. He was one of the greats, if not the greatest.\u201d\n\nI am very shocked and saddened to hear the news of the passing of Mark Hollis. Musically he was a genius and it was a\u2026 Posted by Rustin Man (Official) on Monday, February 25, 2019\n\nSimon Le Bon from Duran Duran said: \u201cWe, Duran Duran, are very sorry to learn that one of music\u2019s great innovators Mark Hollis has died. The band Talk Talk, which he co-founded and fronted, were on tour with us in 1982; it made for a tremendous & very entertaining bill. Mark was the main songwriter of some truly great songs, including \u2018It\u2019s My Life\u2019 & \u2018It\u2019s A Shame\u2019.\u201d\n\nThey continued: \u201cIn 1988 the extraordinary album \u2018Spirit of Eden\u2019 was released. His talent will be remembered & his music will live on.\u201d\n\n\"Mark was the main songwriter of some truly great songs\u201d @TalkTalk pic.twitter.com/beH4kGPcmA \u2014 Duran Duran (@duranduran) February 25, 2019\n\nI always wanted to meet Mark Hollis & say thank you for his music. Hope he knew how much he meant to so many of us. RIP? \u2014 Yannis Philippakis (@YnnsPhilippakis) February 25, 2019\n\nYou can read some of the many other tributes to Hollis from artists and fans below:\n"} +{"output_text": " black chadors would stand on the other.\n\n\"The people's representative\" was a phrase that was repeated by Ahmadinejad's supporters, and it was a phrase that was repeated by the opposition. But it was also a phrase that was repeated by the supreme leader, Ayatollah Ali Khamenei, who was the ultimate arbiter of the election.\n\nKhamenei, who has been in power since 1989, is a man who has never been known to", "input_text": "inejad's predicament to that of Ayatollah Sayed Hussein Ali Montazeri, the man Khomeini chose as his successor as Supreme Leader \u2013 and then abruptly dismissed when Montazeri's son-in-law was seen to have too much power over him. In Ahmadinejad's case, the danger is Rahim-Mashaee, a civil engineer from the Caspian region of Iran, whose daughter is married to Ahmadinejad's son and who packed the president's office with his own supporters from the same region of Iran. At one point Ahmadinejad wanted to make him first vice-president \u2013 a post equivalent to any other nation's prime minister \u2013 only to be thwarted by his opponents.\n\nAccording to those who have followed the saga from Ahmadinejad's original election to the presidency in 2005 \u2013 he only won in the second round and was never expected to hold office \u2013 the conservative clergy and revolutionary founders, the \"principalists\" as they call themselves, believed that Ahmadinejad never had the stature for the role as Iran's political leader.\n\nBut the new president maintained his popular support by touring hundreds of villages, small towns and cities, from Isfahan and Mashad to Tabriz, in order to create the profile of a \"people's representative\" rather than that of a distant father-figure. \"It was like a US president heading off to town hall meetings in New Hampshire every week,\" one of his supporters told me.\n\nWhen Mir Hossein Moussavi stood against Ahmadinejad in the 2009 election \u2013 which the former believed he would have won had the votes been counted fairly \u2013 he would appear on the streets of Tehran with educated, T-shirted young men and women with their hair showing beneath their scarves on one pavement, while bearded men and women in"} +{"output_text": " the environment is a long-standing feature of the Republican Party, and it's one that Trump has been able to exploit to great effect. But the administration*'s willingness to ignore the law and the Constitution in order to further its own political agenda is a new and disturbing development.\n\n* * *\n\nThe Trump administration*'s willingness to ignore the law and the Constitution in order to further its own political agenda is a new and disturbing development.\n\nThe administration*'s", "input_text": "-budget TV advertising, the groups have mobilized activists and demonstrators against Republican incumbents \u2014 and sometimes members of the Trump administration \u2014 in their districts.\n\n\u201cWe\u2019ve seen Ivanka Trump and Vice President Mike Pence make visits to the district,\u201d said Tom Drumm, a Democratic county legislator in Rep. John Katko\u2019s (R-N.Y.) 24th District. \u201cAnd each time Speak Out CNY was able to mobilize hundreds of protesters to push back against their visit and bring the tax scam fight to their doorstep.\u201d I come here to hunt whales, not my captain's vengeance.\n\n\u2014Moby-Dick, Herman Melville\n\nCome hell, high water, plea bargains, or $15,000 ostrich jackets, this administration*'s desire to wipe out the previous president's accomplishments remains the one undimmed characteristic of this president*'s political agenda. On Thursday, for example, it was time to free our truckers to help the planet burn, with yet another attempt to lift the jackboot of good health off of all of our necks. From The Washington Post:\n\nThe Trump administration on Thursday announced plans to freeze fuel-efficiency requirements for the nation\u2019s cars and trucks through 2026 \u2014 a massive regulatory rollback likely to spur a legal battle with California and other states, as well as create potential upheaval in the nation\u2019s automotive market. The proposal represents an abrupt reversal of the findings that the government reached under President Barack Obama, when regulators argued that requiring more-fuel-efficient vehicles would improve public health, combat climate change and save consumers money without compromising safety. Trump\u2019s plan also undercuts California\u2019s long-standing ability to set its own tailpipe restrictions, most recently in an effort to curb greenhouse-gas emissions.\n\nThe conservative attitude toward"} +{"output_text": "'s comments.\n\nGuzman, who is in his 60s, is the most wanted man in the world. He is accused of running a vast drug trafficking organization that smuggled tons of cocaine, marijuana and heroin into the United States.\n\nHe is also accused of ordering the murders of hundreds of people, including rivals, police officers and journalists.\n\nGuzman is believed to be hiding in the mountains of Sinaloa state, where he has a ranch.\n", "input_text": ", 2016\n\nFriday's operation resulted from six months of investigation and intelligence-gathering by Mexican forces, who located Guzman in Durango state in October, but decided not to shoot because he was with two women and a child, she said. After that he took a lower profile and limited his communication until he decided to move to Los Mochis in December.\n\nMexican Attorney General Arely Gomez said that one of Guzman's key tunnel builders led them to the neighborhood in Los Mochis where authorities did surveillance for a month. The team noticed a lot of activity at the house Wednesday and the arrival of a car early Thursday morning. Authorities were able to determine that Guzman was inside the house, she said.\n\nThe marines decided to close in early Friday and were met with gunfire. Five suspects were killed and six others arrested. One marine was injured.\n\nThe official said the meeting between Penn and Guzman was held in Tamazula, a community in Durango state that neighbors Sinaloa, home of Guzman's drug cartel.\n\nOn Friday, Gomez said that Guzman's contact with actors and producers for a possible biopic helped give law enforcement a new lead on tracking and capturing the world's most notorious drug kingpin.\n\nIn the Rolling Stone article, Penn wrote that Guzman was interested in having a movie filmed on his life. He said Guzman wanted Mexican actress Kate del Castillo, who facilitated the meeting between the men, involved in the project.\n\n\"He was interested in seeing the story of his life told on film, but would entrust its telling only to Kate,\" wrote Penn, who appears in a photo posted with the interview shaking hands with Guzman whose face is uncovered\n\nThere was no immediate response from Penn's representatives to the Mexican official"} +{"output_text": " now.\n\nI\u2019m not sure what to do with myself. I\u2019m not sure what to do with myself. I\u2019m not sure what to do with myself. I\u2019m not sure what to do with myself. I\u2019m not sure what to do with myself. I\u2019m not sure what to do with myself. I\u2019m not sure what to do with myself. I\u2019m not sure what to do with myself. I\u2019m not sure what to do", "input_text": " undoing. It can\u2019t happen any other way. And yes, this news makes me angry. I wouldn\u2019t be a real human being if it didn\u2019t.\n\nStill, society needs to make efforts to resolve matters like this without anger. Because an angry response leads to further tragedies, like the angry response of the people West Memphis, Arkansas that led to the wrongful conviction of the \u201cWest Memphis Three.\u201d Still, one can expect anger as a response to something like this. I suppose the worthless motherfucker who perpetrated this bombing intended it that way. In the end, it doesn\u2019t matter. Society has every right to remove him from its midst in any way it sees fit.\n\nSo even while I pray for the well-being and safety of those affected by this incident, I\u2019ll also be hoping that whoever did it is caught and punished. Not only out of anger, but out of a conviction that the smooth functioning of human society is one of our greatest challenges and duties to each other.\n\nThe folks at Monsterpalooza are healthy because they deal with their monsters by turning them into works of art, like the weird and macabre sculptures on display all over the event.\n\nHorror movies serve an important function in society. They allow us to externalize that which disgusts us the most about ourselves and to deal with it in a sane way. I\u2019d rather see more people making zombie movies and sculptures of one-eyed ape monsters and fewer people bombing big cities.\n\nNot that I think that\u2019s the ultimate solution. But it\u2019s been on my mind as events unfolded today.\n\n* * *\n\nI just paid my taxes, like you folks did too I\u2019m sure. So I\u2019ll understand if donations are a little slow. But every little bit is still appreciated especially right"} +{"output_text": "'He told me that he was going to kill her. He said, 'I'm going to kill her, I'm going to kill her, I'm going to kill her.'\n\n'I told him that he needed to calm down. I told him that he needed to be rational. I told him that he needed to be calm. I told him that he needed to be rational.\n\n'He told me that he was going to kill her. He said, 'I'm", "input_text": " money 'changed' Mayweather for the worse.\n\n'He became that little bit more controlling, he became like a spoiled child,' she said.\n\n'He would say, 'I can do it my way', 'I wanna do it this way'. He wasn't as humble as when I first met him.'\n\nHe became that little bit more controlling, he became like a spoilt child. He would say: 'I can do it my way', 'I wanna do it this way'. He wasn't as humble as when I first met him. How money changed Mayweather by Tasha Robinson-White\n\nIn one of his darkest moments Tasha says he kept his girlfriend 'captive' in his home for two days.\n\nThe paranoid boxer became suspicious that a then girl was somehow behind the robbery and he had also learnt she had slept with one of her ex-boyfriends behind Mayweather's back.\n\nHe kept her inside his movie room for two days as he berated her with the accusations, Tasha claims.\n\nTasha said: 'When I returned to the theater room, Floyd was still pacing and hollering incoherently. He instructed his security guards to camp out in front of the door.\n\n'In the room, Floyd was very domineering. I was now not only concerned for his state of mind, I was growing concerned that this woman was being held against her will.\n\n'He taunted her to try to leave. She appeared to be frightened and confused. He threatened to cut off all of her hair, and let the crew take turns with her.\n\n'She didn't say anything. She kept her eyes fixed on the floor. We were in a circular argument that kept branching, and branching. I pulled Floyd aside and told him to try to calm down.\n\n"} +{"output_text": "call votes and the daily grind of governing.\n\nClinton\u2019s path to reinvention will be easier if she can find a way to reconnect with the public. She\u2019s already begun to do that, with a series of speeches and interviews that have been more personal and less scripted than her usual stump speech. She\u2019s also been more open about her struggles with the loss, and her desire to make a difference in the world.\n\n\u201cI\u2019m not going to", "input_text": " \u201cIt\u2019s not fair\u201d to \u201cHe\u2019s lazy/incompetent/a jerk.\u201d\n\nHow is it that Clinton, who is successful and smart, hasn\u2019t followed the most basic, Goop-like advice? For one thing, self-reflection isn\u2019t easy or natural, which is why there are so many of these lists in the first place. Plus, it\u2019s harder to let go when a mob of people\u2014from inner-circle loyalists like Lanny Davis to the internet hordes\u2014are egging you on while rending their garments over Trump.\n\nBut the irony is that Clinton has done it right before, handling past adversity in ways that were productive and inspired. In one of the feminist triumphs of modern politics, she refused to lock herself into the \u201cwronged wife\u201d story line, launched an ambitious bid for U.S. Senate from New York, and began a soaring political career. In 2008, after a tight loss to Obama in the Democratic primary, she graciously accepted his offer to be secretary of state, where she oversaw Washington\u2019s \u201creset\u201d with Moscow, managed sanctions in Iran and crises in Pakistan, and launched the Global Hunger and Food Security Initiative.\n\nPerhaps Clinton was able to bounce back so quickly because she knew the presidency was still within reach. Now that goal is off the table, and she has yet to fully embrace a cause or future that can separate her from her loss. Finding the right path, among many possible options, is surely a challenge. But if she does commit herself to reinvention, Clinton will find that she is better-positioned than ever to make a difference. She\u2019ll be free from the daily trench warfare of Washington. Her every move will still attract attention. And her experiences as senator and secretary of state have prepared her for roll-"} +{"output_text": " some who have questioned the authenticity of the story. The story of Sajan is a part of the Parsi mythology, and is not a part of the Zoroastrian religion.\n\nThe story of Sajan is a part of the Parsi mythology, and is not a part of the Zoroastrian religion.\n\nThe story of Sajan is a part of the Parsi mythology, and is not a part of the Zoroastrian religion.\n\nThe story of", "input_text": " glass of milk by the local ruler Jadi Rana. It was a metaphor conveying the message that there was no space for the newcomers. It was then that the Zoroastrians responded by adding a spoonful of sugar to the milk, demonstrating that they would be \u2018like sugar in a full cup of milk, adding sweetness but not causing it to overflow\u2019.\n\nThey were allowed to live and follow their religion after agreeing to a few of Jadi Rana\u2019s conditions: they would explain their religion to him, they would learn the local language, the women would wear sarees and they would conduct weddings after sunset. This \u201cselective assimilation\u201d, as termed by Harvard Pluralism Project, is what led to the distinctiveness of Parsis from their Zoroastrian counterparts who stayed back in Iran.\n\nYou may also like: An Indian Designer Has Developed a Unique Shelter for Refugees Worldwide\n\nThese remaining Zoroastrians started arriving on the familiar shores of Western India during the 19th century, and are today known as Iranis. To reiterate, they too are Zoroastrians like the Parsis, but are culturally, socially and linguistically distinctive from them.\n\nPromotion\n\nThe qissa of Zoroastrians demonstrate mainly two things: The Indian subcontinent always opened its doors to people from the world and religions survive only when they adapt to the demands of the epoch. Religion, much like any cultural practice, must always be open to change, if it has to survive. However, that does not mean you have to give up your own culture and identity. The \u2018selective assimilation\u2019 of the Parsis exhibited integration into a host country while holding on to the distinctiveness.\n\nThough the Zoroastrian community seems to take the Story of Sajan lore at face value, there have been"} +{"output_text": ", and then having a great partner in our second episode, and then having a great partner in our third episode. We\u2019ve been able to build a really great team, and we\u2019ve been able to build a really great relationship with the actors. We\u2019ve been able to tell them things, and they\u2019ve been able to tell us things, and we\u2019ve been able to build a really great relationship. I think that\u2019s the key. I think that\u2019s the key", "input_text": " been behind what we want to do on this show, from day one.\n\nBecause you don\u2019t have a movie budget, but people have certain expectations of TV shows looking a certain way now, was there anything that you couldn\u2019t do in Season 1, due to time or budget restraints, that you\u2019re hoping you could still do, at some point?\n\nPOKASKI: Yes, but I don\u2019t want to get into specifics. I think there are probably more success stories. We have such an amazing crew. Part of the reason that I picked our particular look and feel is that I didn\u2019t want to wait for a dolly track to be laid, so I put the camera on shoulders. We were able to do a lot. There\u2019s some stuff, in the finale, where we could have been bigger, if we had 30 million more dollars, but we make sure it\u2019s all about our characters. I always wish I had more money, but at the end of the day, I always have to be reminded that the scenes people talk about are of the two kids talking. I go through all of my favorite movies and TV shows, and I don\u2019t really think about the spectacle, as much as the really tight emotional moments. Breaking Bad is one of my favorite television shows, and they blew stuff up, but it was always the mind games and the emotional metaphors that really got me.\n\nBecause Marvel is so secretive, how did you decide what you would tell the actors, as they were auditioning? How do you tell them anything about what they\u2019re auditioning for, to see what they can deliver, without actually telling them what they\u2019re auditioning for?\n\nPOKASKI: It\u2019s so hard! For Aubrey and Olivia, part of it was having a great partner in our pilot"} +{"output_text": " of the Left has not abated.\n\nThe Left has won the culture war, and the conservative intellectuals who have been most successful in the culture war have been those who have been most successful in the political war.\n\nThe conservative intellectuals who have been most successful in the political war have been those who have been most successful in the culture war.\n\nThe conservative intellectuals who have been most successful in the political war have been those who have been most successful in the culture war.\n\n", "input_text": "; or that there was some hidden weakness on the part of conservative intellectuals that made them vulnerable to Trumpism.\n\nPodcast \u00b7 September 17 2020 Robert Tracinski on Flight 93 and the Future of the Right On today\u2019s Bulwark Podcast, Robert Tracinski joins host Charlie Sykes to discuss his recent items on the Anti-Flight 93...\n\nAs evidence for this third possibility, consider a revealing confession by one of those intellectuals at the blog American Greatness.\n\nAmerican Greatness is the kind of place that publishes anonymous \u201cwhite power poetry,\u201d yet qualifies, in the relative world of Breitbart and (these days) the Federalist, as highbrow Trumpism. It even manages to attract a number of reputable old names, the kind of conservative intellectuals who used to complain, back in the 1980s, about the \u201ccoarsening of the culture.\u201d People like the author of that confession, Mark Bauerlein.\n\nYet Bauerlein is now arguing that both culture and politics are \u201cnot a contest of ideas.\u201d\n\nMargaret Thatcher once said that you have to win the argument before you win the vote, but when the Left controls the institutions\u2014or rather, screens conservatives out of those institutions by applying tests of social opinion (\u201cDo you oppose or favor same-sex marriage?\u201d)\u2014Thatcher\u2019s formulation can no longer hold. For 30 years, conservatives have won many debates, issued best-selling books, and swayed public opinion in many areas, but they haven\u2019t slowed the long march of the Left through the institutions at all. For example, Allan Bloom\u2019s The Closing of the American Mind and writings by Roger Kimball, Dinesh D\u2019Souza, Richard Bernstein, and countless others convinced the public that political correctness was becoming a serious problem on college campuses, but the coercive uniformity"} +{"output_text": " Beale has been in the headlines this week after he was knocked out in the first half of the NRL match between the Sydney Roosters and the Melbourne Storm. The Roosters were leading 16-0 at the time, but the Storm came back to win the match 24-16. The Roosters have been in the headlines for all the wrong reasons this week, but the Storm have been in the headlines for all the right reasons. The Storm have been the best team in the", "input_text": "dielectrics (blocked by a bug).\n\nConclusion\n\nThis was a fun project to work on because it gave me an excuse to bring Rust into my day job, but I'm ready to focus my efforts on other projects for now. I would be more than happy to mentor anyone that wants to contribute, and I've put a good deal of effort into making sure that the documentation is thorough and easy to read.\n\nAs for the state of the Rust ecosystem, I still don't think it's quite there yet for the average scientist. I felt pretty comfortable with Rust because I'm a programming nerd, but I still see Python as the tool of choice for most physicists. Here's an illustrative example: find a modified Bessel function of the second kind of order 0. In Python (even if you don't know what I just asked) your first step is to search the SciPy documentation (the function is called k0 there). In Rust, without looking I'm not confident that function exists yet. In the Python world you know that X exists somewhere, you just need to find it, but that certainty that X exists isn't there yet for Rust. Give it time and I think Rust will start to show up in some surprising places.\n\nI don't want this to come off as too negative towards Rust, I love it and wish more scientific software was written in it. When it comes to handling communication between equipment or data collection, I see Rust being a superpower due to its speed, safety, and ease of use. I've been toying with the idea of reimplementing the program that controls/coordinates the equipment in my experiment because I inherited the spaghettiest of spaghetti code, but at the same time I'd like to graduate at some point. Out of form: Recovering from the head knock Kurtley"} +{"output_text": " problems have been held in other cities, including Baghdad.\n\nThe protests were sparked by the government\u2019s decision to cut electricity to Basra, which is home to the country\u2019s largest oil refinery. The government says the move is necessary to reduce the city\u2019s high levels of pollution.\n\nThe protests have been peaceful, but they have also been violent. The government has responded with force, firing tear gas and live ammunition at protesters.\n\nThe protests have been led by", "input_text": " former Indiana governor who became Purdue\u2019s president two years ago. Daniels had introduced my talk and asked me to speak again for guests at a dinner he held that night. He was a delightful, well-read and open-minded host, but he has not returned my messages either. I sent one last note, detailing my main points here, to Purdue\u2019s assistant vice president for strategic communications. I\u2019ll update with her reply if she sends one.\n\nThe irony is that the Dawn or Doom colloquium was Daniels\u2019 own personal project. Two of the organizers told me he is fascinated by the contradictory responses \u2014 from celebration to alarm \u2014 that tend to accompany big technological advances. He proposed to convene Purdue faculty members and leading national experts to explore the risks and promises of artificial intelligence, robotics, and Big Data surveillance, among other developments.\n\nIn his own view, Dawn or Doom is not a hard question. Daniels and I chatted about that theme as we stood in the wings off stage, shortly before my talk.\n\n\u201cThe answer always turns out to be, it\u2019s dawn,\u201d he said.\n\nI wonder.\n\nPostscript: Someone is bound to suggest I post the Purdue talk here. I wish I could, but I did not write it out. Nor are the slides self-explanatory. Most of them are just amusing images, intended to make my remarks sound wittier than they probably are. On the other hand: if you have a samizdat copy of the video, please send it my way. I\u2019ll be glad to publish it.\n\nPostscript #2: Last month, protests broke out in Basra after the Iranian government cut one-third of the total electricity used by Iraq\u2019s second-largest city. Demonstrations against power shortages and other"} +{"output_text": "\n\nCobb was a little drunk. He was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he was a little drunk and he", "input_text": " a lot, like a college girl. She had a sister, Meg, with dark hair, younger but more mature and sleek. Cobb was happy but not complacent. His wife was both.\n\nAfter a year or so, Cobb started thinking of dark hair. The swish and wealth of it. On wide southern avenues brunettes jerked his head around. At first not Meg. It was just damn near every brunette. It was forty-five-year-old brunettes at Lanc\u00f4me counters. It was twenty-seven-year-old cashiers and the dark-skinned Jewish brunettes who perform sure-footed blowjobs on porn sites. Then it was Meg. Then it was the cashier. Then it was both, in his head in the bathroom in the bedroom on a reel billowing like horse manes.\n\nOne night at a wine bar, the sisters looked beautiful and disparate. Everybody drank too much and they all went back to Cobb and Blondie's home, and Blondie made it to the bathroom and passed out there, her blond tresses cascading murderously across the tile like southern blood.\n\nAs a nation we are obsessed with the moment it happens. When alcohol is involved, the moment is a glance of breath. It's the smell of cologne and lacrosse sweat. Meg is on the bed. Her brother-in-law walks halfway across the room and Meg has this look on her face like pre-sin. A white bra strap is showing.\n\nHe kneeled on the bed and she kneeled up to meet him and they kissed and skipped foreplay, pants off, dress hiked up, and they had drilling sex, fast and half-smiling, half look of holy fuck, my sister your wife. The depraved lunacy of gotta have it anyway."} +{"output_text": " is a global event series that brings together the world\u2019s leading Blockchain experts, investors, and entrepreneurs to discuss the latest trends and developments in the Blockchain space.\n\nBlockShow Asia is the second event of the year, and the first in Asia. The event is organized by Cointelegraph, a global media platform that covers Blockchain technology and its impact on the global economy.\n\nThe event is a great opportunity for the Blockchain community to meet and network with the most influential people in the industry", "input_text": " 30, 2011-1988 Peugeot CPX 200\n\nJune 25, 2011 Sunrise Cycles SS 26\u2033 / 700c\n\nJune 25, 2011 More CRB\n\nMay 16, 2011 Firefly Bicycles Livestrong XCr\n\nMay 15, 2011 Brian\u2019s 2005 BMC Team Elite 01\n\nApril 19, 2011 Lee\u2019s stainless lugged Ahearne Road Bike\n\nMarch 24, 2011 1983 Cinelli Golden Black\n\nFebruary 13, 2011 1952 Bianchi Zaffiro Restoration\n\nFebruary 2, 2011 Guillaume Borrot: Ganolo\n\nJanuary 20, 2011 Brian\u2019s Van Dessel Country Road Bob in Winter Mode\n\nWinter Mode 2\n\nJanuary 18, 2011 Zombie Nation Cannondale\n\nJanuary 18, 2011 Chris\u2019s 1963 Raleigh Carlton\n\nJanuary 12, 2011 Serotta Ottrott\n\nJanuary 4, 2011 Bob Kleiber\u2019s Pinarello Surprise\n\nJanuary 3, 2011 Brian\u2019s Gan Well Pro\n\nJanuary 3, 2011 Singletracks: 1995 Yeti\n\nDecember 2, 2010 Hunt\u2019s Vivalo Special\n\nDecember 1, 2010 Hunt\u2019s Horror Bike\n\nNovember 28, 2010 Hunt\u2019s Map of Burlington 3-speed Fixed\n\nNAHBS 2013: ENGLISH CYCLES TT Just a couple weeks ago 1500 blockchain entrepreneurs, experts, investors, and enthusiasts gathered in Singapore at BlockShow Asia, Cointelegraph\u2019s second Blockchain event of the year.\n\nDuring the event 67 speakers from 20 different countries shared their insights and ideas, and attendees from a variety of backgrounds --from Blockchain to business to finance -- met to discuss topics ranging from practical application of Blockchain technology to new ICOs launching right at the event.\n\nWhat is BlockShow?\n\nBlockShow"} +{"output_text": "4.) Add a reference to your PCL project.\n\n5.) Add a reference to your platform project.\n\n6.) Add a reference to your platform project's output.\n\n7.) Add a reference to your platform project's output's output.\n\n8.) Add a reference to your platform project's output's output's output.\n\n9.) Add a reference to your platform project's output's output's output's output.\n\n10.) Add a reference to your", "input_text": ". Es kann also sein, dass der Staatspr\u00e4sident bis zum letzten Moment die Umfragen und Prognosen zum Referendum verfolgen wird, um dann wom\u00f6glich gr\u00fcnes Licht f\u00fcr ein weiteres milit\u00e4risches Abenteuer in Syrien zu geben. Mon 19 June 2017 tags: programming games monogame xna c# csharp\n\nI've dabbled briefly with making games in the past. I've tried Unity, L\u00d6VE, XNA and it's more modern successor MonoGame. I rarely get very far, but it's always a fun little exercise. This time around, I'm trying to make a more determined stab at it with MonoGame. So far, I've made it further than I ever have before--I actually have working collision!\n\nFor this little experiment, I decided to give MonoGame another try because I'm a big ol' C# fanboy, and Lua made me angry. One thing that bit me pretty quickly (aside from the truly frustrating lack of architectural advice) was that there weren't any guides on how to structure a multiplatform MonoGame project.\n\nOne of MonoGame's promises is that it supports just about everything under the sun (even including the Nintendo Switch!), but figuring out how to actually structure your code in a way that makes this easy seemed to be entirely absent from the internet. This post is an attempt to remedy that!\n\ntl;dr\n\nFor those short of time:\n\n1.) Make your first platform project. It can target whatever it is that you want to target.\n\n2.) Add a new project to the solution. This will be your PCL.\n\n3.) You'll want to make sure you choose \"PORTABLE Class Library\", and not use \"Class Library\" of some flavor.\n\n"} +{"output_text": " located,\u201d Tsien said. \u201cThe fluorescent peptides are like the map, showing the location of the nerves. The surgeons can then avoid the nerves and avoid damaging them.\u201d\n\nTsien\u2019s work with fluorescent proteins has also led to the development of a new class of drugs called \u201cphotoswitchable\u201d drugs. These drugs are designed to be activated by light, and then to be deactivated by light.\n\n\u201cThe idea is that you can use light to control the drug\u2019", "input_text": " ultraviolet light. Chalfie showed how it could be used as a biological marker. Combining his deep skills in chemistry and biology, Tsien found ways to make GFP glow more brightly and consistently; then he created a full palette of fluorescent proteins that scientists could use to track different cellular processes at the same time.\n\n\u201cI\u2019ve always been attracted to colors,\u201d Tsien told the San Diego Union-Tribune in 2008. \u201cColor helps make the work more interesting and endurable. It helps when things aren\u2019t going well. If I had been born color-blind, I probably never would have gone into this.\u201d\n\nGFPs have become a fundamental fixture in life sciences labs around the world, allowing researchers to look into cells or whole animals, to watch molecules interact in real-time and ask questions once thought impossible.\n\nCultured HeLa cancer cells depicted using fluorescent proteins to illustrate Golgi apparatus (orange) and microtubules (green), with DNA-carrying nuclei counterstained blue. Image courtesy of National Institutes of Health.\n\n\u201cOur work is often described as building and training molecular spies,\u201d Tsien once said, \u201cmolecules that will enter a cell or organism and report back to us what the conditions are, what\u2019s going on with the biochemistry, while the cell is still alive.\u201d\n\nTsien was never content to rest upon his Nobel laurels. He wanted his research to be clinically relevant. Working with colleagues like Quyen T. Nguyen, MD, PhD, research collaborator and head and neck surgeon at UC San Diego Health, Tsien helped develop experimental injectable fluorescent peptides that cause hard-to-see peripheral nerves to glow, allowing surgeons to avoid them when removing damaged or cancerous tissues.\n\n\u201cThe analogy I use is that when construction workers are excavating, they need a map showing where the existing underground cables are"} +{"output_text": "01pm AUB-260 GA+220 AUB-7 -110 GA+7 -110 64.5 64.5 AUB-3.5 GA+3.5 AUBXX GAXX 11/13 2:23pm AUB-260 GA+220 AUB-7 -110 GA+7 -110 64.5 64.5 AUB-3.5 GA+3.5 AUBXX GAXX 11/13 2:25pm AUB-260", "input_text": "XX -110 XX XX AUB XX GAXX AUBXX GAXX 11/12 5:18pm AUB -260 GA +220 AUB -7 +100 GA +7 -120 62.5 62.5 AUBXX GAXX AUBXX GAXX 11/13 9:39am AUB -250 GA +210 AUB-7 +100 GA+7 -120 62.5 62.5 AUBXX GAXX AUBXX GAXX 11/13 10:42am AUB-250 GA+210 AUB-7 +100 GA+7 -120 62.5 62.5 AUB -3.5 GA+3.5 AUBXX GAXX 11/13 12:15pm AUB-250 GA+210 AUB-7 +100 GA+7 -120 62.5 62.5 AUB-3.5 GA+3.5 AUBXX GAXX 11/13 12:53pm AUB-250 GA+210 AUB-7 +100 GA+7 -120 63.5 63.5 AUB-3.5 GA+3.5 AUBXX GAXX 11/13 1:01pm AUB -260 GA +220 AUB-7 +100 GA+7 -120 63.5 63.5 AUB-3.5 GA+3.5 AUBXX GAXX 11/13 1:23pm AUB-260 GA+220 AUB -7 -110 GA +7 -110 64.5 64.5 AUB-3.5 GA+3.5 AUBXX GAXX 11/13 1:25pm AUB-260 GA+220 AUB-7 -110 GA+7 -110 64.5 64.5 AUB-3.5 GA+3.5 AUBXX GAXX 11/13 2:"} +{"output_text": " gonna talk to Janet Jackson?\u2019 I was like, \u2018I didn\u2019t say nothing to her!\u2019 He was like, \u2018You didn\u2019t say nothing to her? You didn\u2019t say nothing to her?\u2019 I was like, \u2018I didn\u2019t say nothing to her!\u2019 He was like, \u2018You didn\u2019t say nothing to her?\u2019 I was like, \u2018I didn\u2019t say nothing to her!\u2019 He was like, \u2018You didn\u2019t say nothing to her?\u2019 I", "input_text": " as they ask for U.S. Asylum, will be detained or turned away.\" Eric Wright, Jr. could not make out what all the fuss was about. This was not at all shocking considering that the six-year-old boy lovingly known as Lil\u2019 E by friends and family had other priorities on his particularly focused mind. It was the summer of 1989 and at the fabulous Los Angeles Forum, Junior\u2019s notorious father, Eric \u201cEazy-E\u201d Wright, was onstage performing with his provocative group N.W.A.\u2014a five-man, gun-toting, censorship-igniting, F.B.I.-agitating crew brazenly self-billed as The World\u2019s Most Dangerous Group.\n\nFor the purpose of this story, it\u2019s best not to dwell on the question of whether a rap concert featuring arguably hip-hop\u2019s most controversial group\u2014who defiantly proclaimed themselves N*ggaz Wit Attitudes\u2014was a suitable place for a child who would have trouble getting on the rides at Disney Land. Let\u2019s just say Compton was in the house. And so was one of the biggest pop stars on the planet.\n\n\u201cI remember watching the show from the backstage,\u201d recalls the rapper, who years later fittingly goes by the name of Lil Eazy-E. Although he is taller than his stocky 5-foot-5 pops, he shares his father\u2019s strikingly deceptively, youthful gaze. \u201cI was standing right next to Janet Jackson! I didn\u2019t pay it any mind because I was really into the show. When we all got back home my uncle was like, \u2018Well, guess who was standing next to Janet Jackson and didn\u2019t say a word to her?\u2019 My father would always clown me about that [laughs]. He was like, \u2018How you"} +{"output_text": ". Kemal Ishmael has been a pleasant surprise this preseason and has shown he can be a solid contributor. Sean Baker has been a pleasant surprise as well and has shown he can be a solid contributor. Kimario McFadden has been a pleasant surprise as well and has shown he can be a solid contributor.\n\nSpecial Teams:\n\nMatt Bosher\n\nMatt Bosher\n\nMatt Bosher\n\nMatt Bosher\n\nMatt Bosher\n\nMatt Bosher", "input_text": " Starr simply needs a year to sit and learn about the intricacies of the position that he didn\u2019t learn in college.\n\nInside Linebacker:\n\nPaul Worrilow\n\nJoplo Bartu\n\nPrince Shembo\n\nPat Angerer\n\nYawin Smallwood (PS)\n\nThe top 3 on the depth chart are all locks for the roster. The Falcons could well decide to take 5 inside linebackers and take a guy like Tim Dobbins who would contribute mostly on special teams. Personally I don\u2019t feel Dobbins has done anywhere near enough to warrant a spot on the roster and as a result I have the Falcons only taking 4 inside backers with the veteran Angerer taking that final spot. Yawin Smallwood hasn\u2019t looked good this preseason but being a draft pick he gets a year on the PS.\n\nCornerbacks:\n\nDesmond Trufant\n\nRobert Alford\n\nRobert McClain\n\nJavier Arenas\n\nRicardo Allen\n\nThe Falcone may well decide to take 6 corners in which case they would take Josh Wilson but personally I don\u2019t feel Wilson deserves a spot on the roster. Wilson\u2019s seen quite a few snaps this preseason and yet there are not many plays of his that immediately spring to mind. It\u2019s no secret that Wilson is on the decline and from what we\u2019ve seen this preseason there are other players who warrant a spot on the roster more than him.\n\nSafeties:\n\nWilliam Moore\n\nDwight Lowery\n\nKemal Ishmael\n\nDez Southward\n\nSean Baker (PS)\n\nKimario McFadden (PS)\n\nFrom his play vs the Titans Dwight Lowery should have the starting FS spot locked up"} +{"output_text": " A CinemaScore. The film, which cost $30M to produce, is on track to make $8M in its opening weekend. That\u2019s a solid start for a film that cost $20M to make. The Age of Adaline is a rare example of a film that\u2019s not a sequel or a reboot that\u2019s a sequel. It\u2019s a film that\u2019s a sequel to a film that\u2019s a sequel. It\u2019s a film that\u2019s a", "input_text": "way by approximately $77K. The Blake Lively film about a turn-of-the-century woman who remains 29 for several decades deposited $4.83M into Lionsgate\u2019s purse.\n\nThe final frame of April flip flops annually as a date when studios can jumpstart summer (heck, in 2011 Universal proved that with Fast Five\u2018s awesome $86.2M bow) or take it easy. Knowing that Furious fans were going to stampede theaters to watch Paul Walker\u2019s swan song, distribs opted to counter-program the seven-quel rather than throw another tentpole in the marketplace. They\u2019re leaving that responsibility to Disney/Marvel\u2019s Avengers: Age of Ultron next weekend.\n\nF7 is on course to make $17M in its fourth weekend with its total domestic B.O. pushing $319.3M by Sunday. That FSS number is about where Universal expected the film to land; more aggressive estimates predicted $20M. The anticipation is that F7\u2018s Saturday could put the pedal to the metal for a 55% uptick, thanks to the gas from 615 large format screens, repping 16% of the pic\u2019s 3,808 theater count. Social media buzz hasn\u2019t waned. Universal brought out the film\u2019s leading man and social media star Vin Diesel at CinemaCon to rightfully announce Fast & Furious 8\u2018s April 14, 2017 release date. Vin Diesel has collected close to 7M more followers combined across his Facebook and Instagram in the wake of F7\u2018s bow. Not only is his posting daily, but he\u2019s already helping Uni push F7 for an Oscar (see right).\n\nBlake Lively headliner The Age of Adaline may have seduced some ticket buyers tonight who gave it a big smooch with an"} +{"output_text": " not binding on the parties.\n\n\n\nThe only way to acquire property is through force or the threat of force. The only way to defend property is through force or the threat of force. The only way to enforce contracts is through force or the threat of force. The only way to enforce property rights is through force or the threat of force.\n\n\n\nThe only way to acquire property is through force or the threat of force. The only way to defend property is through force or the threat of force", "input_text": " and we might entirely sympathize, but not all the claims in the world will change the fact of possession. Only force -- or its credible threat -- will.\n\n\n\nOf course, it is possible that the original owner can reach an agreement with the thief to return it, or that groups of people can agree to form a cooperative system of property. But these social agreements -- otherwise known as the \"social contract\" -- are only as good as the force that backs them up. Not all the agreement in the world will prevent someone from seizing your property if they decide to dishonor it. Therefore, the basis of all property is force or the threat of force, and it is the topic we must first examine.\n\n\n\nThe relationship between property and force\n\n\n\nGroups are more efficient and effective than individuals at controlling and defending property. One reason is the strength provided by numbers; another is specialization of labor. Groups are doubly efficient at defending property because families defending their homes are also defending their nation as citizens, just as soldiers who defend their nation are also defending their families' homes.\n\n\n\nThe purpose of any territorial group is twofold. First, it defends against external invasions, either through the use or the credible threat of military force. Second, it devises a subordinate system of property for its individual members, and defends this system against internal robbery through the use or the credible threat of police or private forces. Both efforts result in a more stable and secure system of property ownership.\n\n\n\nSocieties have long competed with each other for property, sometimes going to war to conquer it. They remain sovereign only to the extent that they can defend against their competitors' threats. Land purchases, treaties and other forms of property acquisitions have no \"legal\" basis in a higher court -- such a court does not exist in an anarchic system of nations. These agreements are merely symbols, and are"} +{"output_text": "000 more cases had been confirmed, bringing the county\u2019s total to more than 10,000.\n\n\u201cWe are in a crisis,\u201d Bellone said. \u201cWe are in a crisis that is going to get worse.\u201d\n\nThe county\u2019s health-care system is also struggling to keep up. The county\u2019s emergency room is at capacity, and the hospital is running out of beds.\n\n\u201cWe are in a crisis,\u201d Bellone said. \u201cWe are in a", "input_text": " the second consecutive day, New York Gov. Andrew M. Cuomo (D) announced Friday that both Nassau and Suffolk had confirmed 1,000 additional coronavirus cases. Combined, they now have more than 22,000, meaning about one of out every 12 coronavirus case in the United States is located there.\n\n\u201cLong Island does not have as an elaborate of health-care system as New York City,\u201d Cuomo said. \u201c... And that has us very concerned.\u201d\n\nThe spike comes after New York\u2019s worst infection rates had initially been confined to Westchester County, a northern suburb of New York City. But after the virus quickly spread throughout the metropolitan area, Long Island officials said they had been bracing for their caseload to also surge.\n\nAD\n\nNassau County Executive Laura Curran (D) characterized it like this: \u201cIt\u2019s as if you are on a roller coaster that is going up a hill, and it\u2019s just slowly getting higher and higher.\u201d\n\nWith the disaster\u2019s full impact expected to hit in the coming days, she and other leaders across Long Island are rushing to try to shore up their strained emergency-response and health-care system. Curran is requesting that the Federal Emergency Management Agency quickly deploy a disaster-assistance tent city, and she wants FEMA to send 25 out-of-state ambulances to buttress the county force.\n\nAs of Friday afternoon, Curran said there were about 1,620 patients hospitalized in her county, an increase of about 200 over the day before. About 325 were on ventilators, a device that helps critically ill patients breathe. Curran has requested an additional 100 ventilators, but so far only five had arrived.\n\nAD\n\nAD\n\nIn neighboring Suffolk County, County Executive Steve Bellone (D) said 1,"} +{"output_text": " such as the depth of the cave and the quality of the rock.\n\n\"It's a very complex operation,\" he said.\n\n\"You have to be very careful about the type of rock you're drilling through, and the type of rock you're drilling through is very important.\n\n\"You have to be very careful about the type of rock you're drilling through, and the type of rock you're drilling through is very important.\"\n\nMr Brown said the cave was a", "input_text": " soccer coach trapped in a cave in Thailand has drawn hundreds of people, including Elon Musk, to lend their expertise, labour and hope to the task.\n\nLoading\n\nIn a series of tweets, the technology mogul and Tesla chief executive said his Boring Company \u2014 which digs tunnels for advanced transport systems \u2014 had advanced ground-penetrating radar, and brainstormed that an air tunnel constructed with soft tubing like a bouncy castle could provide flexible passage out.\n\nHe said engineers from The Boring Company and SpaceX companies needed to be on site to appreciate the complexities of evacuation.\n\nA spokesperson told the BBC they were \"sending SpaceX/Boring Company people from the US to Thailand today to offer support on the ground\".\n\n\"Once we confirm what exactly will be helpful to send or do, we will.\n\n\"We are getting feedback and guidance from the people on the ground in Chiang Rai to determine the best way for us to assist their efforts.\"\n\nThe Thai Government said Mr Musk's team could help the rescue operation with location tracking, water pumping or battery power.\n\nTesla chief Elon Musk said he was sending engineers to help in Thailand. ( Reuters: James Glover II )\n\nTragically, one diver has lost his life in the rescue effort so far, and concern is also mounting that the air inside the cave may not be fit to sustain life for much longer.\n\nIt puts the prospect of drilling back in the picture, and rescue teams have thrashed their way through dense forest hundreds of metres above the cave complex, searching for an alternative way.\n\nWestern Australian drilling expert Kelvin Brown was part of the successful rescue in 2010 of 33 miners trapped 700 metres below ground in Chile, and said drilling could be used to get the boys out, but there were variables \u2014"} +{"output_text": " she said.\n\n'I'm not a victim. I'm a survivor. I'm a survivor of a very traumatic childhood,' she added.\n\nWeather said she had been in therapy for years and had been diagnosed with PTSD.\n\n'I'm not a victim. I'm a survivor. I'm a survivor of a very traumatic childhood,' she added.\n\nWeather said she had been in therapy for years and had been diagnosed with PTSD.\n\n'I'm", "input_text": " spoke to The Hollywood Reporter and Weather made further allegations against her father.\n\n'Mostly what I remembered was the pain, the memory of the place and time, just being there, in the bath' \u2014 and that 'it was so painful that I couldn't verbalize it for a long time,' Weather told THR.\n\nShe added that her mother had disclosed what had happened to Weather as a child when she was transitioning.\n\nIn a separate interview Mitzner claimed she witnessed the alleged assault against her daughter.\n\nWeather, who works in a vintage clothing store in Savannah, Georgia, admitted what Mitzner had told her about Cohen had put strain on the relationship between mother and daughter.\n\nCohen (center) on the red carpet with stars of The Fast and the Furious Vin Diesel (above left) and Paul Walker (above right)\n\nVin Diesel (above left) with Cohen (above right) in 2003\n\nVin Diesel, (far left) Asia Argento, (left) Cohen (center) and Samuel L. Jackson (right) at the premiere of 'xXx' at the Village Theater in Westwood on August 5, 2002\n\n'You saw this and you did nothing?' Weather said to her mother. 'I called her every name and I broke off contact and I didn't talk to her again for a long time. We have recently taken steps to bridge that.'\n\nWeather said she had contacted her father who sent her an email. 'He basically said that my mother was psychotic and likened her to the Son of Sam killer in terms of the depth of her psychosis,' Weather said.\n\nShe also spoke about the challenges of transitioning and how she dissected 'every childhood moment... it stirs up a lot of old resentments and bitterness and lost opportunities,'"} +{"output_text": "Israel articles.\n\nClinton\u2019s relationship with Netanyahu has been a source of controversy. She has said she was \u201cdeeply disappointed\u201d by his decision to accept an invitation to address Congress in 2015, and she has criticized his government\u2019s policies toward the Palestinians.\n\nClinton has also been criticized for her role in the Iran deal, which she helped negotiate. She has said she would have preferred a tougher deal, but she has also said she believes the agreement is the best option", "input_text": " Israeli-Palestinian peace efforts of both the Clinton and Obama administrations, is now vice president at the Brookings Institution.\n\nStance on Israel\n\nClinton has ties with Israel dating back to her days as first lady of Arkansas, when she adopted an Israeli early education program for the state. Since quitting as Obama\u2019s first secretary of state, she has broadly embraced his quest for Israeli-Palestinian peace as well as his Iran policy \u2013 indeed, she now credits herself as one of the architects of both policies \u2013 but she has also emphasized subtle differences. Clinton has suggested she was not comfortable with making settlements a key point of contention between the Obama and Netanyahu governments, and she says she would closely monitor Iran\u2019s compliance with the nuclear deal.\n\nControversy\n\nDespite her closeness to Israel, Clinton\u2019s decades in the spotlight mean every inflection has come under microscopic examination. Paul Fray, who managed her husband\u2019s failed 1974 congressional race, says she called him a \u201cf\u2014ing Jew bastard\u201d on election night, although he also acknowledges the Clintons did not know at the time that he was one-eighth Jewish. The Clintons deny any such exchange.\n\nClinton was the first official in her husband\u2019s government to speak openly about the prospect of a Palestinian state. As first lady, Clinton embraced Suha Arafat, the wife of the late PLO leader Yasser Arafat, after Suha Arafat delivered a speech accusing Israel of poisoning children. Clinton, who was listening to a simultaneous translation, claims she missed that passage.\n\nWhen last year, Clinton\u2019s private emails were dumped as part of an investigation into her privacy practices while she was secretary of state, it was revealed that one of her Jewish advisers, Sidney Blumenthal, to whom she remains fiercely loyal, kept sending her anti-"} +{"output_text": "$DramTiming2 = \"\"\n\n$DramTiming3 = \"\"\n\n$DramTiming4 = \"\"\n\n$DramTiming5 = \"\"\n\n$DramTiming6 = \"\"\n\n$DramTiming7 = \"\"\n\n$DramTiming8 = \"\"\n\n$DramTiming9 = \"\"\n\n$DramTiming10 = \"\"\n\n$DramTiming11 = \"\"\n\n", "input_text": "SMUDelay = 60\n\nElse\n\n$SMUDelay = 20\n\nEndIf\n\n$DLL = DllOpen(\"WinRing0x64.dll\")\n\n$InitCheck = DllCall($DLL, \"bool\", \"InitializeOls\")\n\nSleep(100)\n\n$DllCheck = DllCall($DLL, \"dword\", \"GetDllStatus\")\n\nSleep(100)\n\nIf $InitCheck[0] = 0 Then\n\nMsgBox(16, \"Cannot initialize the driver!\", \"The driver failed to initialize!\")\n\nExit\n\nEndIf\n\nIf $DllCheck[0] = 1 Then\n\nMsgBox(16, \"Driver error\", \"This platform is not supported by the driver!\")\n\nExit\n\nElseIf $DllCheck[0] = 2 Then\n\nMsgBox(16, \"Driver error\", \"The driver could not be loaded!\")\n\nExit\n\nElseIf $DllCheck[0] = 3 Then\n\nMsgBox(16, \"Driver error\", \"The driver seems to be incomplete. Please relocate the files!\")\n\nExit\n\nElseIf $DllCheck[0] = 4 Then\n\nMsgBox(16, \"Driver error\", \"The driver has been unloaded unexpectedly!\")\n\nExit\n\nElseIf $DllCheck[0] = 9 Then\n\nMsgBox(16, \"Driver error\", \"An unspecified error has occured!\")\n\nExit\n\nEndIf\n\n$CPUID = \"\"\n\n$CPUFMS = \"\"\n\n$SMUORG = \"\"\n\n$BGS = \"\"\n\n$DramConfiguration = \"\"\n\n$DramTiming1 = \"\"\n\n"} +{"output_text": ": Well, I think it's a very real possibility. I mean, the U.S. government has been very clear that they're not going to be able to fix it. They're not going to be able to fix it in the next year. They're not going to be able to fix it in the next five years.\n\nAnd so the question is, what are they going to do? And I think the answer is, they're going to try to buy time. They", "input_text": " these enormous pumps, and they start pumping cement into it until they can't pump anymore. And that's about as exact as it gets.\n\nCORNISH: Can you give us some sense? What are the scenarios should the dam collapse or be weakened?\n\nFILKINS: It's pretty mind-boggling. Both the United States government and the United Nations have run kind of computer models. And I mean for starters, if the dam cracks, then the whole damn will essentially be gone in 12 hours. And what you would have likely is a hundred-foot wall of water that's probably a mile wide rolling down the Tigris.\n\nAnd so Mosul, which is a city of 2 million people, would be under 80 feet of water in less than an hour. Most of Iraq's population lives along the Tigris River all the way down to Baghdad, all the way down to Basra. And I think that's the great fear, is that all of the population centers of Iraq would essentially be submerged.\n\nAnd so the wave they imagine would take about three to four days to reach Baghdad. By the time it got there, it would be about 16 feet high. That's high enough to submerge most of the buildings in Baghdad. It would it would probably submerge the international airport. That would prevent relief crews from coming in.\n\nAnd what is also terrifying is the level of concern that exists within the U.S. government over the likelihood of the dam's collapse. I mean you start reading these documents, and you're like, oh, my God, these guys are really worried.\n\nCORNISH: As we mentioned, Iraq is obviously still dealing with a war. And how likely do you think it will be that they'll be able to fix this problem or to avoid the catastrophe you're describing?\n\nFILKINS"} +{"output_text": " Tong show was on YouTube.\n\nYeah, I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened. I\u2019m not sure how that happened.", "input_text": " sure. You get so used to a laptop and MIDI controller set-up you can do it with your eyes closed. Computers don\u2019t have the versatility that mixing needs in order to be fresh. Basically I wanted to go back to the roots and see what I can do on four decks. I know it\u2019s not vinyl but it\u2019s the same spirit and it means I have to challenge myself with every show.\n\nThe dream would be for me to be the music equivalent of Tesla and get paid by the government to experiment! But other artists seem to love milking it. It\u2019s easier to do that than it is to hide away. It\u2019s easier to play the big star than it is to keep it real.\n\nSo let\u2019s go back to all these other unfinished projects\u2026 There\u2019s the pop album, Vex, right?\n\nWell the pop album stalled due to the situation and the label it was with. I can\u2019t say any more. I changed course and did Blanco and the guy who owns the label wasn\u2019t down with it. Even though, to me, it\u2019s the most honest reflection of what the dancefloor is loving right now. Who knows what\u2019s happening there\u2026.\n\nYou mentioned an orchestral album on Facebook a while ago too\u2026\n\nI did. But that\u2019s last priority. Dance stuff is more important than orchestral stuff\u2026 Right?\n\nIn our noisy electronic world, yeah. But have you seen Pete Tong at The Proms?\n\nOh yeah! I saw this! That is the dream right there! That\u2019s death bed business for me\u2026 Having my favourite tracks arranged by orchestra would be incredible. But the money involved in that is beyond what I can imagine\u2026. I would love to do that shit.\n\nI thought it was cool that the Pete"} +{"output_text": " a halt.\n\nThe banking and business elites' campaign to impose a central bank was spearheaded by the American Bankers Association (ABA), which had been founded in 1878. The ABA was a private, voluntary association of commercial and investment bankers, and its membership included the largest banks in the United States. The ABA's membership included the New York Stock Exchange, the New York Clearing House, the New York Stock Exchange Clearing House, the New York Stock Exchange, the New York Cle", "input_text": ", he recognizes that the limits on bank-credit inflation confronted by a fractional-reserve banking system based on gold are likely to be much less confining under a central bank than under the quasi-decentralized National Banking System put in place immediately prior to the passage of the Federal Reserve Act in 1913. The praxeological reasoning of Austrian monetary theory also leads to the conclusion that those who stand to reap the lion's share of the economic benefits from a bank-credit inflation tend to be the lenders and first recipients of the newly created notes and deposits, namely, commercial and investment bankers and their clients. Guided by the implications of this praxeological knowledge and of his thymological rule about the motives of those who lobby for state laws and regulations, Rothbard is led to scrutinize the goals and actions of the large Wall Street commercial and investment bankers, their industrial clientele, and their relatives and allies in the political arena.\n\nRothbard's analysis of the concrete evidence demonstrates that, beginning in the late 1890s, a full decade before the panic of 1907, this Wall Street banking axis and allied special interests began to surreptitiously orchestrate and finance an intellectual and political movement agitating for the imposition of a central bank. This movement included academic economists who covered up its narrow and venal economic interests by appealing to the allegedly universal economic benefits that would be forthcoming from a central bank operating as a benevolent and disinterested provider of an \"elastic\" currency and \"lender of last resort.\" In fact, what the banking and business elites dearly desired was a central bank that would provide an elastic supply of paper reserves to supplement existing gold reserves. Banks' access to additional reserves would facilitate a larger and more lucrative bank-credit inflation and, more important, would provide the means to ward off or mitigate the recurrent financial crises that had brought past inflationary booms to"} +{"output_text": ", Global Forest Watch, Global Water Intelligence, World Bank, World Resources Institute, Global Forest Watch, Global Water Intelligence, World Bank, World Resources Institute, Global Forest Watch, Global Water Intelligence, World Bank, World Resources Institute, Global Forest Watch, Global Water Intelligence, World Bank, World Resources Institute, Global Forest Watch, Global Water Intelligence, World Bank, World Resources Institute, Global Forest Watch, Global Water Intelligence, World Bank, World Resources Institute, Global Forest Watch, Global Water Intelligence, World", "input_text": ", superpotencias mundiales como Estados Unidos, China e India tendr\u00e1n que afrontar grandes problemas a causa de la reducci\u00f3n del agua, sobre todo en algunas \u00e1reas concretas, como el suroeste de EE. UU. o la provincia china de Ningxia, donde podr\u00eda aumentar de un 40 al 70 %.\n\nEn Sudam\u00e9rica, Chile es uno de los pa\u00edses en situaci\u00f3n m\u00e1s delicada y que ha pasado de un estr\u00e9s h\u00eddrico medio en 2010 a estar considerado como uno de los lugares con estr\u00e9s extremadamente alto en 2040, a causa sobre todo de la subida de la temperatura y el comportamiento cambiante de las precipitaciones en esa regi\u00f3n.\n\nEspa\u00f1a y Grecia son las naciones que lideran el ranking de pa\u00edses europeas m\u00e1s afectados por el estr\u00e9s h\u00eddrico, as\u00ed como sus pa\u00edses vecinos del norte de \u00c1frica.\n\nHay muchas cosas que podemos hacer para reducir los riesgos que tienen que ver con el agua mientras las condiciones a\u00fan nos permitan actuar. Para ello, como casi siempre, es necesario el compromiso de los gobiernos, las instituciones locales, el sector privado y la voluntad individual de cada uno de nosotros: limitar el uso del agua en determinadas actividades, plantar cultivos m\u00e1s eficientes, promover infraestructuras de depuraci\u00f3n y desalaci\u00f3n, restaurar bosques, reducir el desperdicio de alimentos y promover dietas de alimentaci\u00f3n m\u00e1s sostenible\u2026\n\nFuentes: World Resources Institute"} +{"output_text": "\n\nThe lateral CA is quite strong and can be seen in the following 100% crop:\n\nConclusion\n\nThe Batis 18mm f/2.8 is a very good lens for astrophotography. It is fast, sharp and has a very good bokeh. The only thing I would like to see improved is the CA.\n\nThe Batis 18mm f/2.8 is available for around $500 on Amazon.\n\nLinks\n\nBatis 18", "input_text": " high hopes for it. Luckily I had the opportunity to shoot the milkyway under quite good conditions and the lens proved to be an excellent performer here. Stars are a little more forgiving to coma than cityscapes and I couldn\u2019t find any traces of coma when shooting them at all. This is neither the widest (see Nikon 14-24mm 2.8G) nor the fastest (see Nikon 20mm 1.8G) lens for astrophotography but the sum of it\u2019s parts makes it one of the best choices for astrophotography available to date, not just for E-mount but in general.\n\nDistortion\n\nThere is some barrel distortion (mustache style) which can also be clearly visible in architectural shots (see full resolution example below). This will be corrected in newer firmware versions (I am still running 2.0) and to my experience this lens will probably be included with the next Lightroom update as well.\n\nBokeh\n\nWith the 18mm focal length and a maximum aperture of f/2.8 you have to be close to your subject to throw the background visibly out of focus. The bokeh is quite nice without harsh outlinings but the bokeh balls can show an onion ring structure due to the many aspherical elements, as can be seen in this 100% crop:\n\nTake a look at the following example to see what the bokeh looks like with the subject farther away:\n\nSunstars\n\nYou probably know by now I prefer sunstars produced by straight aperture blades but putting this aside sunstars produced by the Batis 18mm stopped down are well defined and quite nice.\n\nChromatic aberrations\n\nlongitudinal\n\nOnly some slight traces of green and magenta outlining can be found, pretty good performance here.\n\n\n\nlateral"} +{"output_text": "QZ] http://boards.4chan.org/f/res/2110264 ARCHIVED Discovered: 13/9 -2013 00:30:59 Ended: 13/9 -2013 03:11:24 Flashes: 1 Posts: 6\n\nFile: skeet fighter.swf-(1.05 MB, 720x480, Hentai)\n\n[_] SKEET FIGHTER SKEET FIGHTER 2110264\n", "input_text": "] Anon 2116468 Stamper you amazing genius\n\n\n\n[X2AG5HE] http://boards.4chan.org/f/res/2111383 ARCHIVED Discovered: 14/9 -2013 04:58:16 Ended: 14/9 -2013 06:03:55 Flashes: 1 Posts: 4\n\nFile: skeet fighter.swf-(1.05 MB, 720x480, Hentai)\n\n[_] SKEET FIGHTER 2111383\n\n>> [_] Anon 2111453 >renaming.gifs\n\n>> [_] Anon 2111455 >># >renaming.swfs to.gifs You know what I meant\n\n>> [_] Anon 2111478 why is Stamper so good.\n\n\n\n[FMHKPOK] http://boards.4chan.org/f/res/2110034 ARCHIVED Discovered: 13/9 -2013 00:45:26 Ended: 13/9 -2013 03:11:24 Flashes: 1 Posts: 6\n\nFile: skeet fighter.swf-(1.05 MB, 720x480, Hentai)\n\n[_] SKEET FIGHTER SKEET FIGHTER 2110034 Marked for deletion (old).\n\n>> [_] Anon 2110036 That was really shitty.\n\n>> [_] TerryMcdingle 2110070 >># >i see what you did there\n\n>> [_] Anon 2110103 my sides every time\n\n>> [_] Anon 2110181 feels so good around myy dick guuuurl\n\n>> [_] Anon 2110188 That last puke is always the one that gets me. I could watch this everyday.\n\n\n\n[ZBE9"} +{"output_text": " the most satisfying things about this season. The team\u2019s first mission together was a success, and the team\u2019s first real mission together was a success. The team\u2019s first real mission together was a success. The team\u2019s first real mission together was a success. The team\u2019s first real mission together was a success. The team\u2019s first real mission together was a success. The team\u2019s first real mission together was a success. The team\u2019s first real mission together", "input_text": "Green Arrow & The Canaries\u201d (Season 8, Episode 9)\n\nImage zoom Dean Buscher/The CW\n\nServing as a backdoor pilot for a potential spin-off, \u201cGreen Arrow & The Canaries\u201d achieved its goal: making viewers excited about following Earth-2 Laurel, Dinah, and Mia into a new show. The hour was visually distinctive from the mothership (not a single dark, gritty warehouse) and introduced several very compelling mysteries that demand answers (How did Dinah wind up here? Who kidnapped William? Who restored J.J.\u2019s memories?). Hopefully, the spin-off gets greenlit. \u2014C.A. 40. \u201cDead to Rights\u201d (Season 1, Episode 16)\n\nImage zoom Jack Rowand/The CW\n\nAny big moment when a comic book hero reveals his/her identity to the people closest to him/her is an instant favorite of mine, and Tommy\u2019s moment of truth lived up to all my hopes and dreams. Tommy\u2019s clear disapproval of Oliver\u2019s actions pushing him to reconcile with Malcolm is as heartbreaking as much as it feels right for the character, especially when Oliver tells Tommy that he was never going to reveal the truth about his vigilante alter ego to him. That small moment in an hour full of crazy action speaks volumes about how much Oliver has closed himself off to his old life. If Tommy has survived past the first season, I doubt he and Oliver would have ever been as close as they were before the Queen\u2019s Gambit went down. \u2014S.B. 37 \u2013 39. \u201cLeft Behind\u201d/\u201dMidnight City\u201d/\u201dUprising\u201d (Season 3, Episodes 10-12)\n\nImage zoom Katie Yu/The CW\n\nWatching the members of Team Arrow come into their own and gain agency was one of"} +{"output_text": " comment on it.\n\n\u201cI\u2019m not aware of any project that\u2019s been approved,\u201d Gibson said. \u201cI\u2019m not aware of any project that\u2019s been denied.\u201d\n\nThe project has been in the works for years.\n\nIn 2013, the county approved a $1.5 million grant to the project\u2019s developer, the California State Lands Commission, to study the feasibility of building a data center on the site.\n\nThe project was approved by the", "input_text": " clients.\n\nGoogle parent Alphabet Inc. also incorporated in Delaware and uses the same registered agent, records show. However, Corporation Service Company\u2019s clients include 90 percent of Fortune 500 companies, nearly 10,000 law firms and more than 3,000 financial organizations, according to its website.\n\nA woman who answered the phone at CSC on Tuesday said she would give a message to a client only if it was a subpoena or a summons.\n\nWhen a reporter asked if she could provide a client\u2019s contact information, she said that\u2019s confidential. When a reporter asked for her name, the line went dead.\n\nProject has neighbors talking\n\nSnap Towing owner Don Ellis, whose business is next to Jasmine\u2019s land, said he heard Google was behind the data center. But when he went online to look into it, he said, it was odd that he couldn\u2019t find anything on the developer.\n\nLos Angeles real estate investor David Emrani, whose family owns land on the other side of the project site, said he\u2019s heard \u201cmany different things,\u201d including that Google was building the facility.\n\nColliers International broker Dan Doherty and CBRE Group broker Greg Tassi, who specialize in industrial properties, said they\u2019ve also heard the Google rumor.\n\n\u201cThey\u2019re holding everything pretty close to the vest,\u201d Tassi said.\n\nOfficials with project contractors Holder Construction, engineering firm WSP and architecture firm HKS did not return calls and emails seeking comment.\n\nHenderson City Councilman John Marz and spokespeople for U.S. Rep. Jacky Rosen, whose districts include the project site, did not respond to requests for comment.\n\nClark County Commissioner Jim Gibson, who also represents the area, was not familiar with the project but couldn\u2019t"} +{"output_text": " Senator, is nominated at Republican National Convention in Chicago.\n\nAugust 27, 1916 Republican Party\u0092s National Committee adopts resolution \u0093to secure the right of all citizens to vote without distinction of race, color, or previous condition of servitude\u0094.\n\nAugust 27, 1916 Republican Party\u0092s National Committee adopts resolution \u0093to secure the right of all citizens to vote without distinction of race, color, or previous condition of servitude\u0094.\n\nAugust 27, 1916 Republican Party\u0092s National", "input_text": " capacity, as alternate delegates.\n\nFebruary 8, 1894 Democrat Congress and Democrat President Grover Cleveland join to repeal Republicans\u0092 Enforcement Act, which had enabled African-Americans to vote.\n\nDecember 11, 1895 African-American Republican and former U.S. Rep. Thomas Miller (R-SC) denounces new state constitution written to disenfranchise African-Americans.\n\nMay 18, 1896 Republican Justice John Marshall Harlan, dissenting from Supreme Court\u0092s notorious Plessy v. Ferguson \u0093separate but equal\u0094 decision, declares: \u0093Our Constitution is color-blind, and neither knows nor tolerates classes among citizens\u0094.\n\nDecember 31, 1898 Republican Theodore Roosevelt becomes Governor of New York; in 1900, he outlawed racial segregation in New York public schools.\n\nMay 24, 1900 Republicans vote no in referendum for constitutional convention in Virginia, designed to create a new state constitution disenfranchising African-Americans.\n\nJanuary 15, 1901 Republican Booker T. Washington protests Alabama Democratic Party\u0092s refusal to permit voting by African-Americans.\n\nOctober 16, 1901 President Theodore Roosevelt invites Booker T. Washington to dine at White House, sparking protests by Democrats across the country.\n\nMay 29, 1902 Virginia Democrats implement new state constitution, condemned by Republicans as illegal, reducing African-American voter registration by 86%.\n\nFebruary 12, 1909 On 100th anniversary of Abraham Lincoln\u0092s birth, African-American Republicans and women\u0092s suffragists Ida Wells and Mary Terrell co-found the NAACP.\n\nJune 18, 1912 African-American Robert Church, founder of Lincoln Leagues to register black voters in Tennessee, attends 1912 Republican National Convention as delegate; eventually serves as delegate at 8 conventions.\n\nAugust 1, 1916 Republican presidential candidate Charles Evans Hughes, former New York Governor and U.S."} +{"output_text": " they\u2019re just going to keep doing that?\n\nJH: I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019s a fair assessment. I think that\u2019", "input_text": " appointments by more explicitly spelling out the constitutional language about how they get made and when they get made, how many appointments each president is entitled to, rather than it being what it is now, which is a lottery. We say each president gets to appoint two justices in every four-year term, people leave after 18 years, so we don\u2019t have to appoint Doogie Howser types to the court anymore. We can go with people in their late 50s and early 60s, who might not have 30 years in them, but might be a better choice in terms of their underlying abilities.\n\nAt the end of the day, I\u2019m pretty convinced that we\u2019re not that far away from the right choosing to do some of these things first, and the last thing I would like to see is Democrats maintaining this commitment to institutionalism and pragmatism and then once again being outflanked.\n\nJH: One thing that\u2019s inspiring conservatives to do all of these things is a sense that they\u2019re facing demographic headwinds\u2014that their core coalition, which Alan Abramowitz described as married white people who identify as Christians\u2014is rapidly shrinking. I don\u2019t buy the \u201cdemographics are destiny\u201d argument, but the demographics certainly represent a coming advantage, and I think that\u2019s pretty clearly recognized by both sides.\n\nThe other thing is that they know that the policy preferences that they advance tend to be unpopular. Cutting taxes on the wealthy and killing safety-net programs doesn\u2019t poll well, while increasing the minimum wage and protecting the environment do.\n\n\n\nDo you think Democrats have, on some level, told themselves, \u201cWell, you know, young people skew our way and we have this growing Latino vote and they\u2019re skewing our way, the Asian-American vote as well, and our policies are popular,\u201d and"} +{"output_text": " and the highest maximum temperature ever recorded in the city, 46.9 degrees, was set on January 28. The Bureau of Meteorology's temperature records go back to 1910.\n\nThe Bureau's records show that Sydney's mean maximum temperature was 1.27 degrees above the 1961-90 baseline, and its mean minimum temperature was 1.18 degrees above. The mean maximum temperature for the entire state was 1.27 degrees above the baseline, and the mean minimum temperature was 1.18 degrees", "input_text": " Michael Morris\n\nScript: Terrence O'Brien\n\nScript Editor: Dana Wollman\n\nCamera: Taylor Ligay\n\nEditor: Willis Lai & Chris Schodt\n\nProducer: Michael Morris Average land and sea-surface temperatures have risen about 1.1 degrees over the past century for Australia, and at a similar pace globally. For Australia, most of the warming has come since the 1950s. \"That means odds favour warmer-than-average temperatures more often than in the past,\" Dr Braganza said. \"When conditions are favourable, such as when you have an El Nino [in the Pacific] or when rainfall is lower over the Australian continent, we won't just get warmer than average conditions, we'll start to push into one of the warmest years on record, or [get] record-breaking events.\" Last year's warmth came even without a kick from an El Nino.\n\nSome of that potential for heat spikes was on show over the past weekend. Much of south-eastern Australia soared into the 40s, including 47.3 degrees in western Sydney, making it the second-hottest day in the Sydney Basin on record. Maximum records Last year was especially warm for Australia's daytime temperatures, with average maximums hotter than any year but 2013, coming in at 1.27 degrees above the 1961-90 baseline. Brisbane clocked up its hottest year on record for maximums, while Canberra and Hobart both had one of their top-three warmest years.\n\nFor Sydney, it was the second warmest year on record for mean and maximum temperatures, and included its hottest single month, last January. Richmond, to Sydney's north-west, set a record of 15 days above 40 degrees, while Observatory Hill's 12 days above 35 degrees equalled a record set in 1926,"} +{"output_text": " including the ability to create and publish reports in HTML5.\n\nSQL Server 2016 will also be the first release to support the new SQL Server 2016 Standard Edition, which will be available in both the traditional and cloud-based editions. The new Standard Edition will be a free upgrade for all SQL Server 2016 customers, and it will be available in the cloud-based edition as well.\n\nSQL Server 2016 will also be the first release to support the new SQL Server 2016 Standard Edition, which will", "input_text": " a pipedream. In addition, SSDT-BI was not part of the SQL Server installation process and had to be downloaded and installed separately.\n\nSQL Server 2014 added support for updateable clustered columnstore indexes, which let users modify and bulk-load data. Nonclustered columnstore indexes were still supported, but they also still could not be updated.\n\nIt should also be noted that around this time Microsoft made significant headway with HDInsight and PolyBase, ceding to the ever-mounting hype around Big Data. Although HDInsight, an Apache Hadoop solution, was already making waves during the SQL Server 2012 era, Microsoft seemed to be pushing harder than ever as SQL Server 2014 drew near, culminating in the release of PolyBase, a T-SQL front-end tool for querying data stored in a Hadoop/HDFS cluster and other data stores. At the same time, Microsoft continued to pull Azure into the picture by providing better integration across all systems.\n\n2016 \u00e2 SQL2016 \u00e2 SQL Server 2016\n\nAnd that brings us to SQL Server 2016, with its rumored codename of \u201cSQL2016.\u201d Microsoft apparently has a number of new and improved BI-related features in mind for the next release. As starters, SSDT and BIDS will evidently become more unified within its Visual Studio shell, making it backward compatible with previous versions and raising hopes that Microsoft will finally get SSDT right. And we might finally be able to script to SSAS tabular data from within the SSDT environment.\n\nFrom all appearances, Microsoft is planning to enhance all the major services. SSIS, for example, will introduce new AlwaysOn capabilities and incremental package deployment. MDS will get a security and performance facelift. SSAS will see new DAX functions and parallel processing capabilities. Even SSRS will be receiving a few upgrades,"} +{"output_text": " 1 A\u00e7\u0131lan Sand\u0131k Oran\u0131 : % 50 Kullan\u0131lan Oy : 1.973 Ge\u00e7erli Oy : 1.973 Kat\u0131l\u0131m Oran\u0131 : % 50\n\nOy Oran\u0131 Oy Say\u0131s\u0131 CUMHUR \u0130TT\u0130FAKI : % 50,00 1 AK PART\u0130 : % 49,99 1 MHP : % 0,00 0 M\u0130LLET \ufffd", "input_text": "m Oran\u0131 : % 40,65\n\nOy Oran\u0131 Oy Say\u0131s\u0131 CUMHUR \u0130TT\u0130FAKI : % 27,09 240 AK PART\u0130 : % 24,15 214 MHP : % 2,82 25 M\u0130LLET \u0130TT\u0130FAKI : % 36,68 325 CHP : % 30,36 269 \u0130Y\u0130 PART\u0130 : % 5,98 53 SAADET PART\u0130S\u0130 : % 0,23 2 H\u00dcDA PAR : % 0,34 3 VATAN PART\u0130S\u0130 : % 0,11 1 HDP : % 35,78 317\n\nJAPONYA\n\nToplam Sand\u0131k : 4 Toplam Se\u00e7men : 3.863 A\u00e7\u0131lan Sand\u0131k : 1 A\u00e7\u0131lan Sand\u0131k Oran\u0131 : % 25 Kullan\u0131lan Oy : 304 Ge\u00e7erli Oy : 303 Kat\u0131l\u0131m Oran\u0131 : % 31,5\n\nOy Oran\u0131 Oy Say\u0131s\u0131 CUMHUR \u0130TT\u0130FAKI : % 38,94 118 AK PART\u0130 : % 34,32 104 MHP : % 3,96 12 M\u0130LLET \u0130TT\u0130FAKI : % 20,79 63 CHP : % 14,85 45 \u0130Y\u0130 PART\u0130 : % 4,62 14 SAADET PART\u0130S\u0130 : % 1,32 4 H\u00dcDA PAR : % 0 0 VATAN PART\u0130S\u0130 : % 0,33 1 HDP : % 39,93 121\n\nKANADA\n\nToplam Sand\u0131k : 39 Toplam Se\u00e7men : 28.095 A\u00e7\u0131lan Sand\u0131k :"} +{"output_text": " of the Chinese Communist Party (CCP) launched a nationwide crackdown on the underground church. The campaign was launched in the name of \u201ccombating religious extremism\u201d and \u201cstrengthening the rule of law\u201d.\n\nThe campaign was launched in the name of \u201ccombating religious extremism\u201d and \u201cstrengthening the rule of law\u201d\n\nThe campaign was launched in the name of \u201ccombating religious extremism\u201d and \u201cstrengthening the rule of law\u201d\n\n", "input_text": " platforms have smaller audiences, so finding an acceptable deal can become challenging. The same issue impacts liquidity \u2014 if it is harder to sell cryptocurrency, it is harder to liquidate it into cash. Accounts on DCE platforms are uninsured. Most importantly, the usage of decentralized crypto exchange platforms is not for a complete beginner (DCE are not as easy to use as CCE).\n\nExamples of decentralized exchanges:\n\nBitsquare NVO (Still in development) BlackHalo Coinffeine Blocknet\n\nDecentralized exchanges are still a work in progress. Probably, now they are not as developed and massive as the centralized exchanges, It can be even said that now they are slightly overlooked, but, possible, in the nearest future there will be a tendency to switch to more decentralized models. After all, the whole world of cryptocurrency kind of carries the idea of decentralizing as such.\n\nTo conclude this article it is necessary to say that the choice of CCE or DCE almost completely relies on the goals the user wants to achieve. As it was listed in pros and cons section, liquidity can be achieved more successfully when using centralized crypto exchanges. In contrary, if the goal of the taken user is anonymity, putting decentralized exchanges in favour is obvious. The advice is \u2014 understand your goals, evaluate your knowledge and choose wisely between the two options!\n\nMeanwhile, if you don\u2019t want to deal with the hassle of trading on exchanges, you can always use the help of instant exchange services. ChangeNOW is one of them and we are glad to be the place where you can swap more than 150 cryptos \u2014 easy as cake! Persecution is escalating in this eastern province: local officials were preparing for higher-ups\u2019 visits by cracking down on churches and investigating believers.\n\nLi Mingxuan\n\nIn September last year, the United Front Work Department"} +{"output_text": " else would a school district in California allow a kindergarten class to dress up as \"drag queens\" and \"drag kings\" for a \"Drag Queen Story Hour\" event?\n\nThe answer is that we are being conditioned to accept this deviancy as normal.\n\nThe \"Drag Queen Story Hour\" event in Portland was sponsored by the Multnomah County Library System. The library's website says the event is \"designed to provide a safe, welcoming, and inclusive space for all", "input_text": " reject division, hold fast to our bold legacy and face this controversy. This is as an opportunity to engage in conversation about how we work together to resolve our differences constructively and continue to move our community forward. I would be pleased to facilitate and participate in any such conversation, now and in the future.\u201d\n\nCBS 42 reached out to the BCRI for further comment about the decision. We were told their offfices were closed on Sunday.\n\nThe Alabama NAACP says they\u2019ll have a statement Monday.\n\nCommunity activist Carlos Chaverst is also planning a press conference to address the decision Monday. ANALYSIS\n\nAre the days of \"live and let live\" over for those of us who believe in traditional, even biblical, values and want to pass that culture on to our children and grandchildren? As much as we may try to ignore it, today's culture is inundated with lust, sexual deviancy, and gender confusion, and - without trying to shock or exaggerate - it's coming for those precious children and grandkids we pray for and try so hard to protect.\n\nJust last week radio host, Todd Starnes, reported that children were \"cavorting\" on the floor with a man dressed garishly as a woman during a \"Drag Queen Story Hour\" in a Portland, Oregon public library. A photo showed a little girl lying directly on top of the drag queen. The photo has since been removed from the library's website.\n\nKey to bringing our kids into this brave new world of accepted deviancy is to keep us silent...whether by overwhelming us with shock and disbelief, or intimidating and shaming us into accepting this \"enlightened\" view of humanity.\n\nWhy else would sensible, loving parents think three-year-olds should be able to choose their gender unless they believed it harmless? Why"} +{"output_text": " will not pay for the wall.\n\nHe promised to repeal and replace Obamacare. The Republican-controlled Congress has not been able to do that.\n\nHe promised to repeal and replace the Affordable Care Act. The Republican-controlled Congress has not been able to do that.\n\nHe promised to repeal and replace the Iran nuclear deal. The Republican-controlled Congress has not been able to do that.\n\nHe promised to repeal and replace the Dodd-Frank Wall Street Reform and Consumer", "input_text": " jobs after being accused of sexual misconduct.\n\nThe movement is not over and neither are the accusations. Although most of the women who accused Trump spoke out but did not take action. Summer Zervos filed a defamation suit that is still making its way through the courts. Alva Johnson, a former Trump campaign staffer, has filed a lawsuit claiming that Trump kissed her without her consent during the 2016 presidential campaign.\n\nThese two lawsuits might cause added pressure and negative news coverage for the president, and could prompt additional women to take legal action against him. This could worsen the gender gap in his approval ratings, costing him the support of more female voters next year.\n\nAccording to an ABC News poll earlier this year, while 49 percent of men approve of the president\u2019s job performance, only 27 percent of women do.\n\nPolls\n\nAnd speaking of polls, they show the president\u2019s overall job approval ratings remain lower than his disapproval ratings. In rounded numbers, the latest RealClear Politics average of polls put his approval rating at 44 percent and his disapproval rating at 52 percent.\n\nThe midterm elections seemed to be a referendum on the president. Democrats gained 40 seats in the House and took over the majority in that chamber.\n\nSince he is up for re-election next year, the 2020 election is far more likely to be a referendum on the president. Polls show Trump trailing former Vice President Joe Biden by 13 points and Sen. Bernie Sanders of Vermont by 11 points in head-to-head matchups in a presidential race.\n\nBroken campaign promises\n\nThe president has a long list of broken campaign promises. There\u2019s not enough space to list them all here.\n\nHe told voters he would build a wall along our southern border and said Mexico would pay for it. Mexican leaders have made clear that their nation"} +{"output_text": "s current movement speed reduction is too strong for a class that is intended to be played as a close-quarters combatant. We are reducing the effectiveness of Suppressive Tools to bring the class more in line with the other ranged combatants.\n\nTactician\n\nDesigner Note: Tactician\u2019s current movement speed reduction is too strong for a class that is intended to be played as a close-quarters combatant. We are reducing the effectiveness of Suppressive Tools to bring the", "input_text": " by Entangling Tools.\n\nGuard Cannon now heals you for 3% (down from 5%) of your total health when damaging a target with Shoulder Cannon. Designer Note: Vanguard survivability currently surpasses the survivability of similar classes. We are reducing the effectiveness of Guard Cannon to bring the class more in line with the other tank classes.\n\nCommando\n\nMass Accelerator has returned in a new form: Increases the range of Explosive Round and High Impact Bolt by 20 meters. Acquired as a level 10 passive ability. Designer Note: Mass Accelerator has returned, granting Commandos a range boost alongside the new Explosive Round and High Impact Bolt maximum range changes to maintain their current range on Live. This additional passive preserves the Commando\u2019s design as a ranged specialist on the battlefield.\n\nBounty Hunter\n\nMissile Blast now has a base range of 10 meters (down from 30 meters).\n\nRail Shot now has a base range of 10 meters (down from 30 meters).\n\nPowertech\n\nDesigner Note: Powertech ranged potential is currently too strong for a class that is intended to be played as a close-quarters combatant. Powertechs are melee onslaught specialists and should be encouraged to engage their targets in close proximity. By reducing the range of Missile Blast and Rail Shot, we place the Powertech\u2019s range potential closer to its design as a close-quarters combatant and further define their unique style in various combat scenarios. Mercenaries maintain their 30-meter range through our reintroduction of \u201cPropulsion Systems.\u201d\n\nSuppressive Tools now reduces the movement speed of targets affected by Magnetic Blast, Flame Burst, and Flame Sweep for 3 seconds (down from 6 seconds). Designer Note: The Powertech\u2019"} +{"output_text": " x 16 players total). The top 16 players from the \"World Final\" event will be crowned \"World Final Champion\" at the FIA Prize-Giving Gala in December. The \"World Final\" event will be held at the same time as the \"Nations Cup World Final\" event. The \"World Final\" event will be held at the same time as the \"Nations Cup World Final\" event. The \"World Final\" event will be held at the same time as the", "input_text": " top 30 players from the Online Final Season's leaderboard will be given an opportunity to advance to the \"Nations Cup Regional Finals\". However, there will be a limit to the number of players from a single country (or locale), as shown below: EMEA Region: Up to 3 players from the same country (or locale) Asia/Oceania Region: Up to 10 players from the same country (or locale) Americas Region: Up to 10 players from the same country (or locale) (If by the end of a Season there are still free spots due to players withdrawing or not qualifying, the next players in the ranking may be invited. When we invite these additional players as replacement, the above limitations related to participants from the same country or locale will not apply. If two or more players have equal amount of points at the end of a Season, the ranking will be determined following these criteria: a) the highest Driver Rating; b) the highest Sportsmanship Rating; c) the earliest \"Gran Turismo Sport\" have been played, according to the PSN Online ID's record.) The selection of the backup participant will not be affected by the above limitations. The top 10 players of the \"Nations Cup Regional Finals\" can advance to the next stage, for a total of 30 players at the \"World Final\" event (3 Regions x 10 players.) The winner of the \"Nations Cup World Final\" will be crowned \"Nations Cup Champion\" at the FIA Prize-Giving Gala in December. Manufacturer Series Structure For the Manufacturer Series, the results of the Online Final Season will produce a global manufacturer ranking, from which the top 16 manufacturers will be selected to advance to the \"World Final\". The top players in each Region for each of these 16 manufacturers can participate in the \"World Final\" event (3 players total from 3 regions"} +{"output_text": " speech in New York City, calling for \u0093the immediate admission of the colored race to the right of citizenship\u0094.\n\nJune 14, 1870 Republican U.S. Sen. Benjamin Wade of Ohio introduces bill to grant African-Americans the right to vote.\n\nJuly 4, 1870 Republican U.S. Sen. Charles Sumner of Massachusetts delivers speech in which he denounces the \u0093Negro\u0092s right to vote\u0094.\n\nJuly 5, 1870 Republican U.S. Sen", "input_text": "This is a country for white men, and by God, as long as I am President, it shall be a government of white men\u0094.\n\nMay 20, 1868 Republican National Convention marks debut of African-American politicians on national stage; two \u0096 Pinckney Pinchback and James Harris \u0096 attend as delegates, and several serve as presidential electors.\n\nSeptember 3, 1868 25 African-Americans in Georgia legislature, all Republicans, expelled by Democrat majority; later reinstated by Republican Congress.\n\nSeptember 12, 1868 Civil rights activist Tunis Campbell and all other African-Americans in Georgia Senate, every one a Republican, expelled by Democrat majority; would later be reinstated by Republican Congress.\n\nSeptember 28, 1868 Democrats in Opelousas, Louisiana murder nearly 300 African-Americans who tried to prevent an assault against a Republican newspaper editor.\n\nOctober 7, 1868 Republicans denounce Democratic Party\u0092s national campaign theme: \u0093This is a white man\u0092s country: Let white men rule\u0094.\n\nOctober 22, 1868 While campaigning for re-election, Republican U.S. Rep. James Hinds (R-AR) is assassinated by Democrat terrorists who organized as the Ku Klux Klan.\n\nNovember 3, 1868 Republican Ulysses Grant defeats Democrat Horatio Seymour in presidential election; Seymour had denounced Emancipation Proclamation.\n\nDecember 10, 1869 Republican Gov. John Campbell of Wyoming Territory signs FIRST-in-nation law granting women right to vote and to hold public office.\n\nFebruary 3, 1870 After passing House with 98% Republican support and 97% Democrat opposition, Republicans\u0092 15th Amendment is ratified, granting vote to all Americans regardless of race.\n\nMay 19, 1870 African-American John Langston, law professor and future Republican Congressman from Virginia, delivers influential"} +{"output_text": " going to let her get away with it. \u201cI\u2019m not going to be a communist. I\u2019m not going to be a communist. I\u2019m not going to be a communist. I\u2019m not going to be a communist. I\u2019m not going to be a communist. I\u2019m not going to be a communist. I\u2019m not going to be a communist. I\u2019m not going to be a communist. I\u2019m not going to be a communist.", "input_text": " gum, which was supposed to enhance satiety and help people like Shirelle stick to the diets they had no need to be on in the first place. I didn\u2019t like to chew it, but my stomach was rumbling. I unwrapped a stick and chewed it. It tasted like caramelized heme proteins, which is to say, cooked blood\u2014in a good way, like a burger\u2014thanks to the transgenic yeast that it was cultured with. My stomach stopped making noise. Maybe it worked (and maybe it was the placebo effect).\n\n\u201cYou sure about that up-and-out?\u201d I tried not to sound too interested. Shirelle had a severe case of risk-aversion.\n\n\u201cGirl.\u201d Her side-eye could cut at a thousand yards. But I had been immune to it since ninth grade.\n\n\u201cCome on, Shirelle. Just asking. It\u2019s a daydream.\u201d\n\nCommunist parties were one of my favorite daydreams to dream: me and my revolutionary comrades in our funny Karl Marx beards, liberating a whole factory under the noses of the cops and the town, running all those machines and giving away free shit until the feedstock ran out. My dream parties didn\u2019t usually take place in a sheet-metal factory\u2014I liked the idea of taking over a scop factory where they made burgers or candy or ice cream because then I would be the person who gave everyone free candy (or burgers! or ice cream!)\u2014but I\u2019d take sheet metal if it was the only thing going. I could learn my skills there, and also Mama wouldn\u2019t kill me for the scop thing if she found out. Damned health-food crazies.\n\n\u201cLenae.\u201d She sounded like her own mama when she warned, but I wasn\u2019t"} +{"output_text": "HD15) was calculated using the Dutch Food Composition Table (NEVO) ( Reference Van der Schouw 24 ).\n\nThe MORGEN study is a prospective cohort study that started in 1993 and was designed to investigate the association between diet and the risk of chronic diseases. The study population consisted of men and women aged between 45 and 75 years who were living in the city of Utrecht, the Netherlands. The study design and methods have been described in detail elsewhere ( Reference Van der", "input_text": " consumption) or pure fruit juice consumption (for associations with fruit consumption). The third and fourth models were additionally adjusted for possible intermediate factors (energy intake, BMI, waist circumference) and important CVD risk factors (systolic blood pressure, TC). In the third model, energy intake and in the fourth model BMI, waist circumference, systolic blood pressure and TC were added as covariates.\n\nData were analysed using SAS 9.4 software (SAS Institute Inc.). Descriptive statistics were used to describe the characteristics of the study population. Cox proportional hazards models were used to estimate the hazard ratios (HR) and 95 % CI for the association of pure fruit juice consumption and fruit consumption with incident CVD, CHD and stroke. Pooled HR were estimated using stratified Cox models, assuming different baseline hazards for the two cohorts. The proportional hazard assumption was fulfilled according to Schoenfeld residuals.\n\nEducational level was defined as low (primary education, lower vocational education, advanced elementary education), intermediate (intermediate vocational education, completion of first 3 years of higher general secondary education) and high (completed higher general secondary education, higher vocational education and university). Cigarette, cigar or pipe smoking was classified as current, former or never. Physical activity was assessed using the validated ( Reference Pols, Peeters and Ocke 21 ) EPIC physical activity questionnaire and classified according to the Cambridge physical activity index into (moderately) active and (moderately) inactive ( Reference Wareham, Jakes and Rennie 22 ). Physical activity was not assessed with the EPIC questionnaire in the first year (1993) of the MORGEN study. Therefore, for 14 % of the EPIC-NL cohort, missing values were imputed using single imputation (SPSS Missing Value Analysis procedure) ( Reference Joosten, Grobbee and van der 23 ). The Dutch Healthy Diet index 2015 (D"} +{"output_text": " boxes of bones are stacked in the meadow, awaiting the artist\u2019s attention.\n\nAD\n\nAD\n\n\u201cI\u2019m not sure what I\u2019m going to do with them,\u201d Brewer said. \u201cI\u2019m not sure what I\u2019m going to do with them.\u201d\n\nThe bones are from the Holocaust, the genocide of six million Jews during World War II. The artist, who is Jewish, has been collecting them since she was a child.\n\n", "input_text": " company to Silver City. \u201cI kind of broke up with fine art 10 years ago,\u201d said Durrie, 38, who runs Power & Light Press, which uses vintage presses. Her company went viral in early 2017 with a simple canvas tote bag in support of Planned Parenthood. Sales exploded, and proceeds in excess of $90,000 have gone to the organization, she said.\n\nAD\n\nAD\n\nFor our anniversary getaway, we booked a room at Bear Mountain Lodge, a 1928 Pueblo-style guesthouse at the edge of the Gila National Forest that was built as a school for delinquents and is now owned by a group of artists. A quick drive from downtown, nestled in high desert dotted with pinyon, juniper and sage, the lodge offers 11 rooms and breakfast. During our four-day stay, we fed carrots to the resident horses, tracked the moon above our balcony and woke to the yipping of coyotes.\n\nOn our first morning, up before dawn, we hiked for two miles on Sunrise Ridge, one of three trails on the lodge property. Afterward, we enjoyed a bountiful breakfast: yogurt, homemade granola, bacon and French toast. We lingered over coffee, watching hummingbirds flit among several feeders on the sunny portico.\n\nLinda Brewer and her partner, John Rohovec, are Bear Mountain\u2019s majority owners. A potter who also owns a gallery in town, Brewer has filled the lodge with art: in the rooms and lobby, on the porticos and throughout the grounds \u2014 even the trails. The newest outdoor art here is One Million Bones, a powerful statement about genocide that was displayed on the Mall in 2013 and now has a permanent home in a meadow on the lodge property. It\u2019s a work in progress: boxes upon"} +{"output_text": " major quake).\n\nBut the Pacific Plate is not the only plate that grinds against North America. The North American Plate is also grinding against the Pacific Plate, and it\u2019s not going to stop anytime soon. The North American Plate is the largest of the three plates, and it\u2019s the one that\u2019s moving the fastest. It\u2019s also the one that\u2019s moving the most slowly. The North American Plate is moving at a rate of about 1 millimeter per year", "input_text": " is exposed most spectacularly as dramatic lava cliffs hanging over Lake Superior in Minnesota and Michigan. But this rift system is quiet. On the other side of New Madrid, the eastern seaboard is positively riddled with old faults\u2014a shattered underworld of scars from continental collisions that once thrust the Appalachians possibly as high as the Himalayas, and wrenching schisms that would later pull the supercontinent Pangaea apart. As this successor to Rodinia split, the Atlantic Ocean was born 200 million years ago. But before it became a proper ocean, it was a network of narrow rift valley lakes\u2014not unlike today\u2019s East African Rift Valley\u2014and it invited prehistoric crocodiles and dinosaurs to its shores. But the division also left behind a bedrock riven with the geological equivalent of stretch marks. Rumbles, like the one that cracked the Washington Monument in 2011, remind us that ancient scars exist unseen all along the east coast, but for the most part, these old faults are quiet. To get earthquakes you need not only faults, but strain.\n\nIn California there\u2019s plenty of strain. Where the Pacific Plate grinds inexorably against the edge of North America, earthquakes are easy to understand. As the plates try to move past each other, they catch, and strain builds up. When this strain reaches a breaking point, the land snaps back into place. These are earthquakes. And these earthquakes will continue with utter inevitability for the foreseeable future as the two plates continue to plow past each other, guided by the incandescent churn of the mantle far below. Along the San Andreas, the ground is warping at a rate of 40 millimeters per year. The longer California goes without earthquakes, the more strain this motion builds up (as a result, Los Angeles is long overdue for a"} +{"output_text": " the cult of personality.\n\nSo, what is fascism? It is a form of nationalism that is toxic, that is perverse, that is irrational, that is irrationalist, that is irrationalist and perverse, that is irrationalist and perverse, that is irrationalist and perverse, that is irrationalist and perverse, that is irrationalist and perverse, that is irrationalist and perverse, that is irrationalist and perverse, that is irrationalist and per", "input_text": " all other areas of life.\n\nAnd what about that toxic nationalism that Gopnik things is fascism? It\u2019s true that nationalism was the gloss that Hitler brought to Naziism. If he hadn\u2019t been vested with complete power, though, the nationalism would have been as innocuous as chants of \u201cUSA! USA!\u201d at Olympic games or the flag waving on Independence Day.\n\nHowever, becauseconcentrated power is toxic power, Hitler took that nationalism, mixed it with perverse race theology, and decided to rule the world as a master race \u2014 and to purge or enslave everyone else. The genocidal nationalism wasn\u2019t the cause of Hitler\u2019s mania; it was the toxic result of concentrating complete political power in the hands of a small group that went mad with that power.\n\nSo how about we go back now and look again at those two inane paragraphs that Adam Gopnik offered to justify his claim that Trump and his supporters are fascists?\n\nAs I have written before, to call him a fascist of some variety is simply to use a historical label that fits. The arguments about whether he meets every point in some static fascism matrix show a misunderstanding of what that ideology involves. It is the essence of fascism to have no single fixed form\u2014an attenuated form of nationalism in its basic nature, it naturally takes on the colors and practices of each nation it infects. In Italy, it is bombastic and neoclassical in form; in Spain, Catholic and religious; in Germany, violent and romantic. It took forms still crazier and more feverishly sinister, if one can imagine, in Romania, whereas under Oswald Mosley, in England, its manner was predictably paternalistic and aristocratic. It is no surprise that the American face of fascism would take on the forms of celebrity television and"} +{"output_text": ",5 \u043c\u043b\u0440\u0434 \u0440\u0443\u0431\u043b\u0435\u0439 \u2014 \u044d\u0442\u043e \u043c\u0430\u043b\u043e, \u043d\u043e \u044d\u0442\u043e \u043d\u0435 \u043f\u043e\u043c\u043e\u0436\u0435\u0442. \u042d\u0442\u043e \u043d\u0435 \u043f\u043e\u043c\u043e\u0436\u0435\u0442 \u043d\u0438 \u043e\u0434\u043d\u043e\u043c\u0443 \u0440\u043e\u0441\u0441\u0438\u0439\u0441\u043a\u043e\u043c\u0443 \u0431\u0438\u0437\u043d\u0435\u0441\u0443, \u043f\u043e\u0442\u043e\u043c\u0443 \u0447\u0442\u043e \u044d\u0442\u043e \u043d\u0435 \u043f\u043e\u043c\u043e\u0436\u0435\u0442 \u043d\u0438 \u043e\u0434\u043d\u043e\u043c\u0443 \u0440\u043e\u0441\u0441\u0438\u0439\u0441\u043a\u043e\u043c\u0443 \u0440\u0443\u043a\u043e\u0432\u043e\u0434\u0441\u0442\u0432\u0443. \u042d\u0442\u043e \u043f\u043e\u043c\u043e\u0436\u0435\u0442 \u0442\u043e\u043b\u044c\u043a\u043e \u0442\u0435\u043c, \u043a\u0442\u043e \u043f\u043b\u0430\u0442\u0438\u0442 \u0437\u0430 \u044d\u0442\u043e.\n\n\u2014 \u0412\u044b", "input_text": "\u0430\u043c \u00ab\u0437\u0430\u043f\u0440\u0435\u0449\u0430\u0435\u0442\u0441\u044f\u00bb. \u0422\u0430\u043a \u043e\u043d \u0437\u0430\u043f\u0440\u0435\u0442\u0438\u0442 \u0432\u0441\u0435 \u0444\u0438\u043b\u044c\u043c\u044b \u043e \u0412\u0435\u043b\u0438\u043a\u043e\u0439 \u041e\u0442\u0435\u0447\u0435\u0441\u0442\u0432\u0435\u043d\u043d\u043e\u0439 \u0432\u043e\u0439\u043d\u0435 \u0438 \u043c\u043d\u043e\u0433\u043e \u0435\u0449\u0435 \u0447\u0435\u0433\u043e, \u2014 \u043e\u043f\u0430\u0441\u0430\u0435\u0442\u0441\u044f \u043a\u0438\u043d\u0435\u043c\u0430\u0442\u043e\u0433\u0440\u0430\u0444\u0438\u0441\u0442.\n\n\u0413\u0435\u043d\u0434\u0438\u0440\u0435\u043a\u0442\u043e\u0440 \u0440\u0430\u0434\u0438\u043e \u00ab\u0428\u0430\u043d\u0441\u043e\u043d\u00bb \u0412\u043b\u0430\u0434\u0438\u043c\u0438\u0440 \u041c\u0430\u0441\u043b\u043e\u0432 \u0441\u0447\u0438\u0442\u0430\u0435\u0442, \u0447\u0442\u043e \u0437\u0430\u043f\u0440\u0435\u0442\u0430\u043c\u0438 \u043e\u0431\u0449\u0435\u0441\u0442\u0432\u0435\u043d\u043d\u0443\u044e \u043c\u043e\u0440\u0430\u043b\u044c \u043d\u0435 \u0443\u043b\u0443\u0447\u0448\u0438\u0442\u044c.\n\n\n\n\u2014 \u041d\u0430 \u043c\u043e\u0439 \u0432\u0437\u0433\u043b\u044f\u0434, \u0437\u0430\u043f\u0440\u0435\u0442\u0430\u043c\u0438 \u043c\u0430\u043b\u043e \u0447\u0442\u043e \u043c\u043e\u0436\u043d\u043e \u0434\u043e\u0441\u0442\u0438\u0447\u044c, \u043f\u043e\u0442\u043e\u043c\u0443 \u0447\u0442\u043e \u0440\u0430\u0437\u0443\u043c\u043d\u044b\u0435 \u0437\u0430\u043f\u0440\u0435\u0442\u044b \u043f\u043e\u043c\u043e\u0433\u0430\u044e\u0442 \u0432 \u043e\u0431\u0449\u0435\u043c \u0432\u043e\u0441\u043f\u0438\u0442\u0430\u043d\u0438\u0438 \u043e\u0431\u0449\u0435\u0441\u0442\u0432\u0430 \u0438\u043b\u0438 \u043f\u043e\u0434\u0440\u0430\u0441\u0442\u0430\u044e\u0449\u0435\u0433\u043e \u043f\u043e\u043a\u043e\u043b\u0435\u043d\u0438\u044f \u0432 \u043f\u0440\u0430\u0432\u0438\u043b\u044c\u043d\u043e\u043c \u0440\u0443\u0441\u043b\u0435. \u041d\u043e \u043d\u0438\u043a\u0430\u043a \u0438 \u043d\u0438\u043a\u043e\u0433\u0434\u0430 \u043e\u043d\u0438 \u043d\u0435 \u043c\u043e\u0433\u0443\u0442 \u0440\u0435\u0448\u0438\u0442\u044c \u043f\u0440\u043e\u0431\u043b\u0435\u043c\u0443. \u0423\u0436\u0435 \u0438\u0437\u0432\u0435\u0441\u0442\u043d\u043e \u0438\u0441\u0442\u043e\u0440\u0438\u0447\u0435\u0441\u043a\u0438, \u0447\u0442\u043e \u0437\u0430\u043f\u0440\u0435\u0442\u044b \u043f\u0440\u0438\u0432\u043e\u0434\u044f\u0442 \u043a \u0430\u0431\u0441\u043e\u043b\u044e\u0442\u043d\u043e \u043e\u0431\u0440\u0430\u0442\u043d\u043e\u0439 \u0440\u0435\u0430\u043a\u0446\u0438\u0438. \u0410 \u0447\u0442\u043e \u043a\u0430\u0441\u0430\u0435\u0442\u0441\u044f \u0441\u0440\u0435\u0434\u0441\u0442\u0432 \u043c\u0430\u0441\u0441\u043e\u0432\u043e\u0439 \u0438\u043d\u0444\u043e\u0440\u043c\u0430\u0446\u0438\u0438, \u0442\u043e \u043a\u0430\u0436\u0434\u044b\u0439 \u0440\u0430\u0431\u043e\u0442\u0430\u0435\u0442 \u0432 \u0442\u043e\u043c \u043d\u0430\u043f\u0440\u0430\u0432\u043b\u0435\u043d\u0438\u0438 \u0438 \u0441\u043e\u0437\u0434\u0430\u0435\u0442 \u0442\u043e\u0442 \u043f\u0440\u043e\u0434\u0443\u043a\u0442, \u043a\u043e\u0442\u043e\u0440\u044b\u0439 \u0442\u0430 \u0438\u043b\u0438 \u0438\u043d\u0430\u044f \u0430\u0443\u0434\u0438\u0442\u043e\u0440\u0438\u044f \u043f\u043e\u0442\u0440\u0435\u0431\u043b\u044f\u0435\u0442 \u2014 \u0432\u0435\u0434\u044c \u044d\u0442\u043e \u0436\u0435 \u0440\u044b\u043d\u043e\u043a, \u0438 \u0441\u0440\u0435\u0434\u0441\u0442\u0432\u0430 \u043c\u0430\u0441\u0441\u043e\u0432\u043e\u0439 \u0438\u043d\u0444\u043e\u0440\u043c\u0430\u0446\u0438\u0438 \u2014 \u044d\u0442\u043e \u0442\u043e\u0436\u0435 \u043a\u043e\u043c\u043c\u0435\u0440\u0446\u0438\u044f. \u0415\u0441\u043b\u0438 \u0435\u0441\u0442\u044c \u043d\u0435\u043a\u0430\u0447\u0435\u0441\u0442\u0432\u0435\u043d\u043d\u044b\u0439 \u043f\u0440\u043e\u0434\u0443\u043a\u0442 \u0438 \u0435\u0441\u0442\u044c \u043d\u0430 \u043d\u0435\u0433\u043e \u0441\u043f\u0440\u043e\u0441, \u0437\u043d\u0430\u0447\u0438\u0442, \u043f\u0440\u043e\u0431\u043b\u0435\u043c\u0430 \u0432 \u0434\u0440\u0443\u0433\u043e\u043c. 1"} +{"output_text": "It\u2019s a dog, not a snake,\u201d Josh had said, when she\u2019d first brought it home. \u201cIt\u2019s a dog, not a snake.\u201d) And they\u2019d be good with it, because they\u2019d be good with everything. They\u2019d be good with her. They\u2019d be good with Josh. They\u2019d be good with the world.\n\nShe\u2019d been so good with the world.\n\nShe\u2019d been so good with the", "input_text": " Denizens, such as a Fox with a distended stomach; such as a fey Robin that would improbably carry the Loaf away, speared on its beak, whenever it had succeeded in dropping a Clonking Rock on your Baker\u2014all of which Marie had learned over the summer by studying the Noble Baker manual while Josh was asleep.\n\nAnd it had helped, it really had. Josh was less withdrawn lately, and when she came up behind him now while he was playing and said, like, \u201cWow, honey, I didn\u2019t know you could do Pumpernickel,\u201d or \u201cSweetie, try Serrated Blade, it cuts quicker. Try it while doing Latch the Window,\u201d he would reach back with his non-controlling hand and swat at her affectionately, and yesterday they\u2019d shared a good laugh when he\u2019d accidentally knocked off her glasses.\n\nSo her mother could go right ahead and claim that she was spoiling the kids. These were not spoiled kids. These were well-loved kids. At least she\u2019d never left one of them standing in a blizzard for two hours after a junior-high dance. At least she\u2019d never drunkenly snapped at one of them, \u201cI hardly consider you college material.\u201d At least she\u2019d never locked one of them in a closet (a closet!) while entertaining a literal ditchdigger in the parlor.\n\nOh, God, what a beautiful world! The autumn colors, that glinting river, that lead-colored cloud pointing down like a rounded arrow at that half-remodelled McDonald\u2019s standing above I-90 like a castle.\n\nThis time would be different, she was sure of it. The kids would care for this pet themselves, since a puppy wasn\u2019t scaly and didn\u2019t bite. (\u201c"} +{"output_text": "ened and\n\nHe thought he was quite a bit leeter.\n\n\n\nThere once was a man named John\n\nWho was quite a bit leeter.\n\nHe was quite a bit leeter,\n\nHe was quite a bit leeter,\n\nHe was quite a bit leeter.\n\n\n\nThere once was a man named John\n\nWho was quite a bit leeter.\n\nHe was quite a bit leeter,\n\nHe was quite a bit le", "input_text": "To write all of his limericks reversed.\n\n\n\nint factorial(int sum) {\n\nif (sum == 1) return 1;\n\nif (sum!= 1)\n\nreturn product(sum,\n\nfactorial(sum - 1)); }\n\n\n\nThere once was a girl named Jude,\n\nWho's skirt by the wind was strewed.\n\nA man came along,\n\nAnd unless im quite wrong,\n\nYou expected this last line to be lewd.\n\n\n\nThere was a limerick I heard,\n\nWith stressed syllables quite awkward.\n\nRhythm was somewhat\n\nStill present in it, but\n\nIt forced mispronouncing every word.\n\n\n\nIf a pendulum's swinging quite free\n\nThen it's always a marvel to me\n\nThat each tick plus each tock\n\nOf the grandfather clock\n\nIs 2 pi root L over g.\n\n\n\nThere once was a man named Eugene\n\nwho invented a screwing machine.\n\nConcave and convex,\n\nit served either sex,\n\nand played with itself in between.\n\n\n\nThere once was a half-elven bard,\n\nWhose lute-ing was deemed avant-garde.\n\nHe considered it vogue,\n\nTo take levels in rogue,\n\nAnd he'd lute you while you were off guard.\n\n\n\nThere once was a man from O'Doole\n\nwho discovered red spots on his tool.\n\nHe went to the doc,\n\nwho looked at his cock,\n\nand said, \"wipe off the lipstick, you fool.\"\n\n\n\nThere once was a man named Vegeta\n\nWho thought he was quite a bit leeter.\n\nYet nine thou on the gauge\n\nleft him madd"} +{"output_text": " the Arabs.\n\nThe idea of a binational state is not new. It was first proposed by the Zionist movement in the late 19th century. The idea was rejected by the Arabs, who were not ready to give up their land.\n\nThe idea of a binational state was also rejected by the Jews who lived in Arab countries. The Jews who lived in Arab countries rejected the idea of a binational state because they were not ready to give up their land.\n\nThe", "input_text": " from embracing the values of democracy and tolerance and liberalism.\n\nAre the Palestinians not Muslim (other than perhaps 5 percent who are Christian)? Are the Hamas regime in Gaza and the Palestinian Authority in the West Bank liberal and tolerant and democratic? Are there grounds for believing that the Palestinians would conduct themselves differently than their Arab brethren elsewhere? In short, are the Palestinians comparable to Norwegians?\n\nA one-state solution with Jews and Arabs is a recipe for endless violence and anarchy that would ultimately lead to a country with an Arab majority \u2013 and a persecuted Jewish minority that would do anything to escape, as the members of Jewish communities in Arab countries did when their neighbors chased them out between 1948 and 1965.\n\nIn his article, Levy notes the hostility and hate that developed among the Arabs toward the Jews over the past 100 years (and to a lesser extent, hatred of Jews toward Arabs also developed).\n\nAnd rightfully so. From the Arabs\u2019 standpoint, we have stolen the country from them, trampled upon their dignity, jailed many and killed the parents of thousands of them. Would such things be forgotten when the Arabs, along with us, establish one state? Wouldn\u2019t it make more sense for the Arabs to use the new state to exact their revenge and retake the land and the homes that were \u201cstolen\u201d from them in 1948 and subsequently? There is also no doubt that the relative wealth of the Jews in the country would be tempting to the less well-off Arabs. Property crime would skyrocket.\n\nIn the 1920s, 1930s and 1940s, there was a handful of Jews who, through the groups Brit Shalom and Ihud, advocated for a binational state. The idea never took off. The vast majority of Jews rejected the idea, but it attracted even less support (almost no one, actually) among"} +{"output_text": " of your business practices. Second, we're a state that's been run by a bunch of rich white guys for a long time, and they've been able to get away with it because we're so far away from the rest of the country. Third, we're a state that's been run by a bunch of rich white guys for a long time, and they've been able to get away with it because we're so far away from the rest of the country.\n\nThe first two", "input_text": " to go off mere miles from Interstate 35. Aside from the near constant smell of cow shit, this fertilizer depot was not out of the ordinary, so the NIMBYs stayed at home.\n\nHere we see how the real power behind Texas Republicans perpetuates itself: through the unthinking, unknowing capitulation of the NIMBYs. Keep things quiet and we should be fine, is the mantra. Texas is the headquarters of corporate America's moral authority. It's the place where they can say, Yes, Everything's Working Fine, just like we told you. It's a place where an awkward, grandstanding suit like Perry can shoot off automatic rifles on camera, then lick his lips and declare economic dominance, even though his state's battalion of former used car salesmen barely survived the recession by getting jobs managing the local McDonalds. That's sadly what the state's economic \"recovery\" has amounted to: a boom in low-wage service jobs, very few of which offer even basic health care at prices workers can afford.\n\nThe problem with this corporate moral authority is that it's the same complete bullshit as the moral authority of globalism today: it places convenience and profits above survival, and makes undercutting your own working class to steal a few jobs away from a neighboring state seem like a net positive, not the shameful cannibalism it really represents. As we peel away the layers of this Banana Republic, it becomes clear that the only thing keeping it from spoiling entirely is blind faith.\n\nIn a nutshell, that's the problem with Texas: We're the biggest cannibal of them all right now, for a variety of reasons. First, if you're a corporation, we literally do not care what you're doing, and the apathy runs so deep that we'll abdicate even the most basic oversight"} +{"output_text": " in Paris on November 13, 2015, were a watershed moment for the French left. The Socialist Party, which had governed France for the previous 14 years, was defeated in the first round of the presidential election, and the far-right National Front (FN) won the second round. The Socialist Party\u2019s defeat was a result of a combination of factors, including the party\u2019s failure to address the economic crisis, its inability to address the issue of immigration, and its inability to address the issue", "input_text": " to it tends to have a hard time, and which seems to be getting ever more controversy-averse. Houellebecq himself has said that Whatever, the novel that made his name in France, could never be published today: \u201cour societies have come to a terminal stage where they refuse to recognize their malaise, where they demand that fiction be happy-go-lucky, escapist; they simply don\u2019t have the courage to face their own reality.\u201d So it may be that Houellebecq\u2019s career path could never be followed today, and he can be seen as a dinosaur, a relic, someone who slipped past the bouncer and achieved a degree of popular status at a different moment when it was more permissible for dangerous ideas to proliferate.\n\nMaybe we can get there again. In the meantime, reading Houellebecq, and thinking about what he represents, is a good first step. The formal elements of his literary stardom \u2013 the way he seems to absorb every controversy while continuing to write, continuing to gain popularity, never wilting under the gaze of the outrage machine \u2013 are inspiring, to be sure, providing a target for conservatives and cultural libertarians to aim for. But it\u2019s the content of his books that sets him apart, for by reading those books, we can gain a degree of internal freedom: the means to slip the mind-forged manacles endemic to our time and place and dream of something else. Houellebecq has found a way to look at contemporary sexual practices from the outside, as something to be questioned rather than taken for granted. By learning to question as he does, we, too, can free ourselves from the compulsion to treat the status quo as inevitable.\n\nShare this: Pocket\n\nWhatsApp\n\n\n\nEmail\n\nPrint\n\n The terrorist attacks"} +{"output_text": " was fined $26.5 million.\n\nBut the verdict has stunned activists, who had hoped that the original guilty verdict would set a strong precedent for the judicial fight against environmental crimes.\n\nThe verdict has effectively been derailed by a district court on the basis of a simple typo.\n\nThe court in the city of Medan, in the northern part of Sumatra, ruled on Tuesday that the company was not guilty of the crime because the court had used the wrong date to", "input_text": "BTB case (shown in Fig. 4c), no frequency fading is found for the focusing case (shown in Fig. 4d). This absence of frequency fading can be attributed to the limited illuminated area on the diffuse reflector, causing negligible multiple path delays. The shaped wavefront is projected to a spot with a diameter of ~1.5 mm. Consequently, the time delay between the shortest and longest path from the transmitter cannot exceed 10 ps. For the OFDM signal used in our experiment, its cyclic prefix length is 1.333 \u03bcs, which is much larger than the maximum time delay (10 ps). Therefore, inter-symbol interference is not a limiting factor in the proposed system. Detailed analysis of the link performance is presented in S3 in the Supplementary Information. A district court in Indonesia has shielded an oil palm company from a Supreme Court ruling ordering it to pay $26.5 million in fines for burning peatlands in a high-biodiversity area, citing a typo in the original prosecution.\n\nThe verdict has stunned activists, who had hoped that the original guilty verdict would set a strong precedent for the judicial fight against environmental crimes.\n\nThe government is appealing the latest ruling, which, ironically, is fraught with typos that \u2014 under the same legal logic \u2014 would render it just as invalid as the original guilty verdict.\n\nJAKARTA \u2014 A landmark Indonesian case that saw an oil palm company fined millions of dollars for burning carbon-rich peatlands has effectively been derailed by a district court on the basis of a simple typo.\n\nPT Kallista Alam was in 2015 found guilty by the Meulaboh District Court in Aceh province of using fire to clear 10 square kilometers (3.9 square miles) of land in the Tripa peat swamp on the northwest coast of Sumatra. The company"} +{"output_text": " deutlich erweitern. Der Spieler, der sich beim Fallen mit der Hand abzust\u00fctzt, sollte nicht mehr als eine Handvoll Zentimeter \u00fcber dem K\u00f6rper des Gegners stehen. Der Gegner, der sich beim Fallen mit der Hand abzust\u00fctzt, sollte nicht mehr als eine Handvoll Zentimeter \u00fcber dem K\u00f6rper des Gegners stehen.\n\nDie neuen Reg", "input_text": "chst: Wird ein Treffer mit Hand oder Arm erzielt, ist dieser prinzipiell irregul\u00e4r. Es reicht sogar schon, wenn der Spieler mit Hand oder Arm in Ballbesitz kommt und sich so \"einen klaren Vorteil verschafft\", etwa zu einer Torchance kommt - ganz egal, ob das Handspiel absichtlich erfolgte oder nicht.\n\nAbsichtliche Handspiele bleiben strafbar. Unabh\u00e4ngig von der Absicht liegt nun aber \"in der Regel\" auch dann ein Vergehen vor, wenn der Spieler seine K\u00f6rperfl\u00e4che unnat\u00fcrlich vergr\u00f6\u00dfert oder wenn sich der Arm \u00fcber der Schulter befindet und vom Ball touchiert wird. Das gilt auch, wenn der Ball aus kurzer Distanz kommt. Legitim ist es derweil, sich beim Fallen mit der Hand abzust\u00fctzen, hier soll entsprechend kein Handspiel (mehr) gepfiffen werden. Und auch wenn der Ball vom eigenen K\u00f6rper oder vom K\u00f6rper eines beliebigen anderen Spielers, der sich in der N\u00e4he befindet, an Hand oder Arm springt, sollen die Referees ab sofort nicht mehr pfeifen (wenn eben nicht auch die K\u00f6rperfl\u00e4che unnat\u00fcrlich vergr\u00f6\u00dfert oder der Arm \u00fcber der Schulter ist). Der Ballkontakt sei in diesen Situationen schlie\u00dflich \"oft unvermeidbar\", so das IFAB.\n\nDie neuen Formulierungen sollen den Ermessensspielraum des Referees"} +{"output_text": " email accounts to conduct official business, a violation of federal record-keeping laws.\n\nThe White House has said that the president\u2019s personal lawyer, Rudolph W. Giuliani, was using his personal email account to conduct official business. But the White House has not said whether Mr. Giuliani was using his personal account to conduct official business.\n\nThe White House has said that the president\u2019s personal lawyer, Rudolph W. Giuliani, was using his personal email account to", "input_text": "\u2019ve worked at all with fork(). After the fork, the parent and child have copies of all open file descriptors and data values, so the pointer works for both. pid, however, differs. The child gets 0, the parent gets the process ID of the child, and the value of the variable determines which of the if / then / else branches to take. The child writes some bytes to the pointer and then exits. The parent waits for the child to exit and then reads what was written.\n\n. After the fork, the parent and child have copies of all open file descriptors and data values, so the pointer works for both., however, differs. The child gets 0, the parent gets the process ID of the child, and the value of the variable determines which of the / / branches to take. The child writes some bytes to the pointer and then exits. The parent waits for the child to exit and then reads what was written. Before the parent can exit, however, it must free the shared memory. munmap() and shm_unlink() do the trick.\n\nThis example is very elementary. A real application would use semaphores or other techniques to control reading and writing to the shared segment. Such control is typically application specific, and you can find many examples in the Berkeley Software Distribution (BSD) and Linux source, if your UNIX flavor is not open source.\n\nAll for one\n\nBecause UNIX runs many applications seemingly at the same time, it\u2019s an ideal platform for monitoring, data collection, cooperative and distributed computing, and client-server applications. Shared memory is the fastest of the interprocess communications options available and is quite flexible. You can map files into memory, as well \u2014 an ideal solution for accelerating data access. On Tuesday, lawmakers said that President Trump\u2019s top envoys for Ukraine and the European Union used personal"} +{"output_text": "\n\n\"I have a lot of pain. I have a lot of anxiety. I have a lot of depression. I have a lot of sleep problems,\" he says.\n\n\"I have a lot of things that I need to be medicated for.\"\n\nJack says he\u2019s been growing pot for about 10 years and he\u2019s never had a problem with police.\n\n\"I\u2019ve never had a problem with the police. I\u2019ve never had a problem with", "input_text": " reach their racist goals. That was the message Israeli officials and I conveyed in our discreet talks with the administration. The response was not, to our mind, reasonable.\n\n\nMichael Oren, a member of the Israeli parliament, the Knesset, is the author of \u201cAlly: My Journal Across the American-Israeli Divide,\u201d to be published June 23.\n\nFollow the Opinion section on Twitter @latimesopinion and Facebook A B.C. pot grower says he and many others are making thousands of dollars every month growing licensed medical marijuana and selling it for illegal distribution on the streets, and there is little police can do to stop it.\n\nAs he descends the steps of his Nelson home, Jack throws a switch and lights up 85 well-tended pot plants. Exhaust fans hum as he tends to his crop.\n\n\"This is a three weeks in bud. You can see they got a nice bud on them,\" says Jack, who CBC agreed to identify only by his first name.\n\n\"It's starting to smell nice. Another four weeks and these will be done,\" he says.\n\nJack says he has been a pot grower his whole adult life and he\u2019s made a good living at it, comfortably raising a family.\n\nSome licensed medical marijuana growers are earning thousands of dollars each month on the illegal market. (Bob Keating/CBC)\n\nHe estimates he gets about $20,000 every two months from his basement operation. With six crops a year, that\u2019s about $120,000 a year.\n\nJack says he used to grow his pot illegally and sell it on the Prairies, but that was too risky, so he applied to Health Canada for a Personal Use Production licence.\n\nHe filled out a few forms and got a doctor to sign off on a medical condition."} +{"output_text": " past few decades. In the 1970s, the prevailing theory was that the universe was filled with a vast number of small, isolated structures. But in the 1990s, astronomers began to realize that the universe was far more complex than that.\n\nThe new study, published in the journal Nature, shows that the universe is filled with a vast number of large, interconnected structures. The largest of these structures is a supercluster of galaxies that spans a distance of about 2.5 billion light years", "input_text": ", which contains 8 overseas offices and 60+ engineers. We have multilingual teams from pre-sales consulting, through manufacturing, shipping, installation, commissioning, product updates, to after-sales tracking, etc. More than that, Beston has rich exporting experience to save your time and money.\n\nEco-friendly Design \u2013 Reduce Environmental Cost\n\nProtecting the environment has become a new trend, and each government pays more attention to the related project. Our waste pyrolysis plants are eco-friendly. We equip a spray de-dusting system to purify the waste produced during the pyrolysis process to meet the EU emission standard. Below are Beston pyrolysis plants in the UK, Romania, and Turkey.\n\nAll in all, the cost of pyrolysis plant is affected by many factors. Please remember to cooperate with a reputable manufacturer first. Then to find the one fits your needs the best. Pay more attention to the ROI instead of the pyrolysis machine price only. Of course, you should make clear what materials and how many are you going to process, your budget, etc. This will help you to get a suitable plant soon. If you feel confused, just contact Beston Group for help now! I believe you will enjoy partnering with us!\n\n Astronomers Discover Largest Structure in the Universe\n\nIt\u2019s ten billion light years across and almost as far away but nobody had spotted it\u2026until now\n\nWhat\u2019s the largest structure in the Universe? That\u2019s a question that has intrigued scientists for centuries. Today, they get an answer thanks to astronomers who say they\u2019ve discovered the largest structure ever observed and one that dwarfs the previous record-holder by billions of light years.\n\nAstronomer\u2019s ideas about the universe\u2019s largest structures have changed dramatically in the"} +{"output_text": ", who was a member of the Central Committee of the Bolshevik Party, wrote that \u201cthe dictatorship of the proletariat is the period of civil war.\u201d94\n\nTrotsky, who was a member of the Central Committee of the Bolshevik Party, wrote that \u201cthe dictatorship of the proletariat is the period of civil war.\u201d95\n\nTrotsky, who was a member of the Central Committee of the Bolshevik Party, wrote", "input_text": ". For example\u2026\n\nTrotsky habitually eschewed Lenin\u2019s pattern of anxiously inquiring into \u201cwhat Marx said,\u201d on any given question; and of course this attitude can be explained as a justified aversion to quotation-mongering. The fact is, however, that typically Trotsky not only didn\u2019t care \u201cwhat Marx said\u201d but often didn\u2019t know what Marx thought\u2014an ignorance which can also be vindicated, perhaps, provided he did not attempt to expound what he was ignorant of. In Terrorism and Communism Trotsky came a cropper at the first attempt. Referring to Locus 12 he wrote that Engels \u201cobstinately defended the dictatorship of the proletariat as the only possible form of its control of the state.\u201d91 This formulation would have been impossible to Engels, for whom the \u2018dictatorship of the proletariat\u2019 was not a form of the workers\u2019 state but a synonym for the workers\u2019 state.\n\nSimilarly: Kamenev plainly had no idea that the Russian party was the only one that had programmatically adopted the \u2018dictatorship of the proletariat.\u2019 \u201cThe dictatorship of the proletariat,\u201d he wrote, \u201cappears in the programs of the Socialist parties not later than the seventies of the nineteenth century.\u201d92 This real ignorance of the history of Marxism and the movement should be borne in mind when we come, below, to exegeses on the Paris Commune.\n\nMore crudely than Lenin, these leaders and theoreticians plainly equated the \u2018dictatorship of the proletariat\u2019 with the period of civil war. Kamenev flatly called the \u2018dictatorship\u2019 a \u201cperiod of warfare,\u201d \u201can epoch of undisguised warfare, and armed clash\u2026\u201d93 Bukharin"} +{"output_text": "nergy.\n\nInformation forwarded to bug-gnu-emacs gnu.org :\n\nbug#4008 ; Package emacs. (Sun, 06 Mar 2016 06:41:02 GMT) to; Package. (Sun, 06 Mar 2016 06:41:02 GMT) Full text and rfc822 format available.\n\nMessage #41 received at 4008 debbugs.gnu.org (full text, mbox):\n\nFrom", "input_text": "\n\nAcknowledgement sent to rms gnu.org :\n\nExtra info received and forwarded to list. Copy sent to Emacs Bugs gnu.org>. (Tue, 04 Aug 2009 05:35:04 GMT) toExtra info received and forwarded to list. Copy sent to. (Tue, 04 Aug 2009 05:35:04 GMT) Full text and rfc822 format available.\n\nbug reassigned from package 'emacs' to 'emacs,ns'. Request was from Lars Magne Ingebrigtsen gnus.org> to control debbugs.gnu.org. (Sat, 17 Sep 2011 07:05:02 GMT) Request was fromto. (Sat, 17 Sep 2011 07:05:02 GMT) Full text and rfc822 format available.\n\nInformation forwarded to bug-gnu-emacs gnu.org :\n\nbug#4008 ; Package emacs. (Sun, 06 Mar 2016 06:41:02 GMT) to; Package. (Sun, 06 Mar 2016 06:41:02 GMT) Full text and rfc822 format available.\n\nMessage #40 received at 4008 debbugs.gnu.org (full text, mbox):\n\nFrom: Hagmonk icloud.com> To: 4008 debbugs.gnu.org Subject: bug#4008: 23.1; missing menus in mac os x build when using synergy Date: Sat, 05 Mar 2016 20:12:47 -0800\n\nI propose this bug be returned to the reporter with a request to reproduce on a more recent version of Emacs and Sy"} +{"output_text": " most part, mental illness is not a natural kind. Instead, it is a complex, polygenic, and highly variable set of conditions that are influenced by a multitude of genetic and environmental factors.\n\nThe most recent edition of the DSM, DSM-5, has been criticized for its lack of attention to the genetic and environmental factors that influence mental illness. But the fact is that the DSM-5 is not the first edition of the DSM to be criticized for its lack of attention to genetic and", "input_text": " B. Guze, rejected the murky psychoanalytic diagnostic formulations of their time. Instead, they embraced a medical model inspired by the careful 19th-century observational work of Emil Kraepelin, long overlooked during the mid-20th-century dominance of Freudian theory. Mental disorders were now to be seen as distinct categories, much as different bacterial and viral infections produce characteristic diseases that can be seen as distinct \u201cnatural kinds.\u201d\n\nDisorders, Robins and Guze argued, should be defined based on phenomenology: clinical descriptions validated by long-term follow-up to demonstrate the stability of the diagnosis over time. With scientific progress, they expected fuller validation of mental disorders to derive from laboratory findings and studies of familial transmission.\n\nThis descriptive approach to psychiatric diagnosis -- based on lists of symptoms, their timing of onset, and the duration of illness -- undergirded the American Psychiatric Association\u2019s widely disseminated and highly influential Diagnostic and Statistical Manual of Mental Disorders, first published in 1980. Since then, the original \u201cDSM-III\u201d has yielded two relatively conservative revisions, and right now, the DSM-5 is under construction. Sadly, it is clear that the optimistic predictions of Robins and Guze have not been realized.\n\nFour decades after their seminal paper, there are still no widely validated laboratory tests for any common mental illness. Worse, an enormous number of family and genetic studies have not only failed to validate the major DSM disorders as natural kinds, but instead have suggested that they are more akin to chimaeras. Unfortunately for the multitudes stricken with mental illness, the brain has not given up its secrets easily.\n\nThat is not to say that we have made no progress. DNA research has begun to illuminate the complex genetics of mental illness. But what it tells us, I would argue, is that, at least for the"} +{"output_text": "\n\nThe attack on Saudi Arabia\u2019s oil facilities was a \u201cblatant violation of international law,\u201d the U.S. ambassador to the United Nations, Nikki Haley, said on Monday.\n\n\u201cThe United States is aware of the reports of an attack on Saudi Arabia,\u201d Haley said in a statement. \u201cWe are working with the Kingdom of Saudi Arabia and other partners to determine the facts of the attack, and we will continue to provide updates as they become available.\u201d", "input_text": " Foundation for Defense of Democracies (FDD) notes is like an arsonist offering to send a fire brigade. Mark Dubowitz, also with FDD, adds that prior U.S. administrations have tried restraint, but Iran has never positively responded to such gestures.\n\nThe strongest evidence that Iran and not its Yemeni client carried out the attack on Saudi Arabia\u2019s prime oil field comes from the attack\u2019s sophistication:\n\n[The north/northwesterly direction of the attack and the failure of missile defense batteries to intercept] suggest a low-level cruise missile attack that hugged the ground at altitudes of under 300 feet. The \u2026 arrival rate was very high, possibly even 95 percent \u2026 routes were carefully planned to avoid obstacles such as power lines and communication towers. Seventeen individual impact points were struck at the Abqaiq facility, with a smaller number (perhaps as low as two) at Khurais. The weapons were highly accurate \u2014 for instance, all twelve of the thirty-meter-wide spheroid gas-oil separation tanks at Abqaiq were hit almost dead center. Much thinner stabilization towers were also accurately struck. There are even indications of finesse in the strike\u2019s \u201cweaponeering,\u201d the technical term for munition selection and modification. Some \u201caim-points\u201d were clearly hit with large explosive payloads consistent with an Iranian cruise missile such as the 700-kilometer-range Ya-Ali. Yet the gas-oil separation tanks appear to have been struck with high-velocity kinetic force sans explosions, perhaps signaling an effort to damage but not permanently destroy them. Similar finesse was visible in Iran\u2019s May 12 attacks in the Fujairah anchorage off the United Arab Emirates, where four ships had their hulls expertly holed without causing the vessels to spill oil, sink, or suffer massive fires."} +{"output_text": " targeted by these individuals in the past,\" Grewal said. \"We have to be vigilant and we have to be prepared.\"\n\nThe FBI is assisting in the investigation, Grewal said.\n\n\"We're going to be working with the FBI to see if there's any other information that they can provide to us,\" he said.\n\nThe FBI is also working with the New Jersey State Police, the New Jersey Office of Homeland Security and Preparedness, the New", "input_text": " the store,\" Lax said.\n\nHe said the woman fired her weapon at him as he ran out of the store. Surveillance footage captured the moment the shooters opened fire as they walk inside the market. A man, who NBC News has identified as Lax, runs out the store and across the street. He said he was lucky because he ran in front of a car that seconds later pulled away from the store.\n\n\"I didn't look right or left, I just ran for my life,\" he said.\n\nAttack was an act of terrorism, officials say\n\nAuthorities are not yet sure why the shooters, David N. Anderson, 47, and Francine Graham, 50, attacked the store, New Jersey Attorney General Gurbir S. Grewal said. But it is being investigated as an act of terrorism with \"a hate-crime bias slant.\"\n\n\"We believe the suspects held views that reflected hatred of the Jewish people, as well as a hatred of law enforcement,\" Grewal said, citing evidence and witness interviews.\n\nJersey City Mayor Steven Fulop said Friday it was important to call out anti-Semitism.\n\n\"We've always had a diverse community and I think people need to realize that if anti-Semitism can exist in a place that's accustomed to diversity, it can really exist anywhere,\" Fulop told CNN's \"New Day.\"\n\n\"And every day and every moment that you don't call it out, you're wasting an opportunity to bring attention to it because there's less eyes focused on it.\"\n\nBoth shooters expressed an interest in the Black Hebrew Israelites movement, though neither appear to have formal links to the movement, Grewal said. Some members of the movement have expressed anti-Semitic sentiments in the past.\n\n\"Our community has been"} +{"output_text": " and has since been a regular in the first team. He has been a regular in the first team since the start of the season, and has been a regular in the first team since the start of the season, and has been a regular in the first team since the start of the season, and has been a regular in the first team since the start of the season, and has been a regular in the first team since the start of the season, and has been a regular in the first team", "input_text": "ting trolls; the absolute majority prefers feeding the parasites.\n\nThe \u2018big boys\u2019 of the IT industry are indirectly involved in the development of the patent trolling industry, and facilitate the extortion of smaller companies.\n\nTrolls continue to be really underhand and nasty, bombarding potential victims with nonsense actions \u2013 and without incurring liability in case of withdrawal of a claim.\n\nEarlier, patent showdowns were the lot of major vendors. Now, even small developers of mobile applications are involved in this vicious circle. The continuation of the trend is fraught with scandalous transfers of mega funds from innovators to social parasites, provoking the collapse of the IT industry.\n\nGovernments need to move to resolute, specific and systemic action against trolls, including things like on the list here.\n\nSo it\u2019s with all my heart that I congratulate and thank our team and our court counsels from Patterson Thuente IP for protecting our intellectual property! The victory has already had one curious consequence: it caused slight though wholly irreparable damage to the world\u2019s reserves of 18-year-old Chivas Regal :).\n\nUPDATE: prior art that can be used to invalidate Lodsys\u2019 patents. The free hit at APOEL last week felt like the perfect opportunity for Mauricio Pochettino to give Marcus Edwards his first start for Tottenham.\n\nIf you only watch first team football and pay no interest to our youth teams, this will likely be your only prior knowledge of our young attacking midfielder:\n\n#ICYMI yesterday, a glimpse into the future of Spurs with Marcus Edwards\u2026 #COYS pic.twitter.com/berCu4A3ix \u2014 New York Spurs (@NYSpurs) September 22, 2016\n\nEdwards made his debut in this match in September 2016,"} +{"output_text": " contest. It\u2019s the album that made Priest the metal band they are today.\n\nThe Metallian is a weekly column by writer and musician Dani Filth. He is the lead singer of the band Cradle Of Filth and the author of the book The Metallian.\n\nShare this: Twitter\n\nFacebook\n\nPinterest\n\nTumblr\n\n", "input_text": " to say he bowed out on a high would be an understatement. The title-track is perhaps the single finest pure metal song ever written, exploding in an orgiastic welter of speed metal riffs, furious rhythms and lung-bursting screams. Not every song here is faster than a laser bullet but they all consistently slay, from the sinister creep of A Touch Of Evil to the piledriving stomp of Between The Hammer & The Anvil.\n\n2. British Steel (1980)\n\nBy the turn of the \u201980s Judas Priest were already veterans but they gate-crashed the NWOBHM party like cool older cousins who could bring the booze. Fittingly, British Steel boasted the ultimate party song in Living After Midnight. It kicked off with the proto-thrash Rapid Fire but generally featured a more singalong, anthemic sound. Breaking The Law remains their most recognisable song while Metal Gods gave the band their epithet. Bonus fact: in a time before digital sampling was commonplace, they made the sound of marching metal feet by rattling trays of cutlery in the studio.\n\n1. Defenders Of The Faith (1984)\n\n\u201cRising from darkness where hell hath no mercy, and the screams of vengeance echo on forever, only those who keep the faith shall escape the wrath of The Metallian\u2026 Master of all Metal.\u201d The only thing that even comes close to being as metal as the legend on the back of this album is maybe the titanium plate implanted into Slayer frontman Tom Araya\u2019s neck from too much headbanging. And, of course, the contents of the album itself. Defenders Of The Faith is frequently overlooked in favour of its predecessor Screaming For Vengeance. In terms of breaking the band and overall impact, there\u2019s no"} +{"output_text": "ant donn\u00e9 que le code est cass\u00e9, et que les d\u00e9veloppeurs sont en retard, les d\u00e9veloppeurs ne peuvent pas faire de d\u00e9veloppement. Et c\u2019est l\u00e0 que le probl\u00e8me se pose : les d\u00e9veloppeurs ne peuvent pas faire de d\u00e9veloppement.\n\nLe probl\u00e8me est que les d\u00e9veloppeurs ne peuvent pas faire de d\u00e9veloppement.\n\nLe probl\u00e8me est que les d\u00e9veloppeurs", "input_text": "), car vous comprenez, les grands groupes font confiance \u00e0 cette technologie, elle permet d\u2019employer des juniors, il y a une grosse communaut\u00e9, etc. Le prestataire peut empiler stagiaires et d\u00e9butants sur le code, car c\u2019est l\u00e0 la promesse de l\u2019approche, leur mettre la pression, et apr\u00e8s pas mal de larmes, de retard, et de d\u00e9pit, une application dont personne n\u2019est fier part en production. Peu de temps apr\u00e8s, il faut tout de m\u00eame la faire \u00e9voluer. Quelle est l\u2019annonce qui va \u00eatre produite pour le recrutement? Cherche d\u00e9veloppeur wordpress/rails/django, niveau bac+5 exig\u00e9, aimant le challenge, connaissant au moins 10 blagues et \u00e9tant force de proposition sous la pression.\n\nEt c\u2019est parti, le probl\u00e8me est maintenant auto-r\u00e9pliquant : l\u2019entreprise ne va attirer que des personnes qui ne voient aucun souci avec ce qui a \u00e9t\u00e9 produit. Ils vont peut \u00eatre rire de comment \u00e7a a \u00e9t\u00e9 mal fait, oubliant par la m\u00eame la quantit\u00e9 de pression, et de d\u00e9cisions prises \u00e0 la h\u00e2te ayant men\u00e9es \u00e0 cette situation.\n\nDans un monde comme celui-l\u00e0, ce cycle para\u00eet normal. C\u2019est le jeu de l\u2019informatique : tout est toujours hors budget, en retard et bugu\u00e9. Du coup, autant ne pas payer cher quelque chose qui sera de toute mani\u00e8re cass\u00e9.\n\nPour expliquer le titre de cette section, que s\u2019est-il pass\u00e9 ici? \u00c9t"} +{"output_text": ". P. 188\n\nLinda was shocked. She had no idea that she was a victim of Dr. Cameron's depatterning. She had no idea that she was a victim of the CIA's MKULTRA program. She had no idea that she was a victim of the CIA's mind control experiments. P. 189\n\nLinda was not a victim of Dr. Cameron's depatterning. She was a victim of the CIA's mind control experiments. She was a victim", "input_text": ". 181-182\n\nDr. Cameron proved that doctors skilled in the right procedures can erase a subject's memory. His depattering technique resulted in permanent and complete amnesia. To this day, Linda MacDonald is unable to remember anything from her birth to 1963. As recorded by nurses in her chart, Linda was reduced to a vegetable state. She was completely disoriented. She didn't know her name, age or where she was. She didn't recognize her children. She couldn't read, drive, cook, or use a toilet. Not only did she not know her husband, she didn't even know what a husband was. P. 182-183\n\nThere is a connection to politics, power, and weapons in Linda MacDonald's life. Her husband worked for the Canadian Armament Research Development Establishment. His immediate boss was a man who sold arms to Saddam Hussein. His boss was also tied into the Iran-Contra affair, and was murdered in Europe a few years ago. P. 186\n\nLife changed for Linda when the Canadian Broadcasting Corporation program, The Fifth Estate, aired a segment on Dr. Cameron on January 17, 1984. A Vancouver newspaper ran a full-page story on Robert Loggie, a Vancouver man who had been experimented on by Dr. Cameron. Loggie was a plaintiff in the class action suit against the CIA for Dr. Cameron's MKULTRA experiments, which was settled out of court for $750,000, divided among the eight plaintiffs. P. 187\n\nLinda's mother phoned her about the program. Linda shook a lot in reaction to the news and didn't know what to do. Through a reporter she got in touch with a Washington lawyer representing the eight Canadian plaintiffs. He advised Linda that she could not be a party to the class action suit against the CIA because she was 'treated' by Dr. Cameron"} +{"output_text": "ifs de solidarit\u00e9.\n\nLes non-recours sont aussi une r\u00e9ponse \u00e0 la d\u00e9gradation des conditions de travail. Les travailleurs pauvres sont de plus en plus souvent confront\u00e9s \u00e0 des conditions de travail d\u00e9grad\u00e9es, \u00e0 des contrats de travail pr\u00e9caires, \u00e0 des salaires insuffisants, \u00e0 des horaires de travail insuffisants, \u00e0 des conditions de travail d\u00e9grad\u00e9es, \u00e0 des contrats de travail pr\u00e9c", "input_text": " les assiste et les active en permanence. Ils pr\u00e9f\u00e9raient se passer de l\u2019argent du RSA activit\u00e9 plut\u00f4t que de soumettre au principe d\u2019un dispositif qui institutionnalise la cat\u00e9gorie de travailleurs pauvres et le pr\u00e9cariat comme condition socialement acceptable d\u00e8s lors qu\u2019il donne lieu \u00e0 des dispositifs de compensation financi\u00e8re [7].\n\nSur un tout autre plan, puisqu\u2019il s\u2019agit d\u2019une aide sociale locale, l\u2019offre de \u00ab panier solidaire \u00bb propos\u00e9e par des communes rencontre le m\u00eame type de non-recours. Cette aide prend place dans un ensemble de dispositifs destin\u00e9s aux m\u00e9nages les plus pr\u00e9caires, des secours financiers d\u2019urgence aux banques alimentaires, en passant par les \u00e9piceries et les vestiaires solidaires\u2026 Sur des sites que nous avons \u00e9tudi\u00e9s, le taux de non-recours avoisinait les 80%. L\u2019explication produite par les enqu\u00eat\u00e9s est sans appel. Ce cas lui aussi massif de non-demande intentionnelle s\u2019explique par un d\u00e9saccord avec la condition d\u2019acc\u00e8s qui subordonne l\u2019attribution des paniers \u00e0 leur pr\u00e9paration avec d\u2019autres habitants et des professionnels. En l\u2019esp\u00e8ce, les non-recourants, la plupart du temps des femmes, n\u2019ont pas envie de s\u2019exposer en public (la pr\u00e9paration des paniers est typiquement un cas d\u2019exposition publique qui cr\u00e9e de la stigmatisation). Ils ne partagent pas le principe qui innerve de plus en plus les disposit"} +{"output_text": " as the method that provides the individual with information about the values and goals of his own future self. The method of specific understanding is the method of anticipating the future state of affairs that will result from the actor's own and other actors' actions.\n\nThe method of specific understanding is thus a method of anticipating the future state of affairs that will result from the actor's own and other actors' actions. The method of specific understanding is thus a method of anticipating the future state of affairs", "input_text": " objectives of the \"technostructure\" of large business firms in acceding to these demands and deciding what part of the cost increase to pass on to consumers. Thus, according to Mises,\n\nIf what these disciplines [i.e., the aprioristic and the natural sciences] teach is insufficient or if the historian chooses an erroneous theory out of several conflicting theories held by the specialists, his effort is misled and his performance is abortive.\n\nBut what exactly is the historical method of specific understanding, and how can it provide true knowledge of a wholly subjective and unobservable phenomenon like human motivation? First of all, as Mises emphasizes, the specific understanding of past events is\n\nnot a mental process exclusively resorted to by historians. It is applied by everybody in daily intercourse with all his fellows. It is a technique employed in all interhuman relations. It is practiced by children in the nursery and kindergarten, by businessmen in trade, by politicians and statesmen in affairs of state. All are eager to get information about other people's valuations and plans and to appraise them correctly.\n\nThe reason this technique is so ubiquitously employed by people in their daily affairs is because all action aims at rearranging future conditions so that they are more satisfactory from the actor's point of view. However, the future situation that actually emerges always depends partly on the purposes and choices of others besides the actor. In order to achieve his ends, then, the actor must anticipate not only changes affecting the future state of affairs caused by natural phenomena, but also the changes that result from the conduct of others who, like him, are contemporaneously planning and acting. Understanding the values and goals of others is thus an inescapable prerequisite for successful action.\n\nNow, the method that provides the individual planning action with information about the values and goals of other actors is essentially the same"} +{"output_text": " required between the new value and the old value.\"\n\nThe value of money is not a natural value, but a social value. It is a value which is created by the act of society, and which is not inherent in the metal itself. It is a value which is created by the act of society, and which is not inherent in the metal itself. It is a value which is created by the act of society, and which is not inherent in the metal itself. It is a value", "input_text": ". The great natural difficulty which originally stood, in the way of exchanges is now the private property of a class, and this class cultivates this difficulty, and make money out of it, even as a farmer cultivates his farm and makes money by his labor. But there is a difference between the farmer and the usurer; for the farmer benefits the community as well as himself, while every dollar made by the usurer is a dollar taken from the pocket of some other individual, since the usurer cultivates nothing but an actual obstruction. You cannot monopolize corn, iron and other commodities as you can money, for to do so you would be obliged to stipulate in your sales that payment shall be made to you in those commodities. What a commotion would exist in the community if a company of capitalists should attempt permanently to monopolize all the corn! But money, by the nature of the case, since it is the only legal tender, is always monopolized. This fact is the foundation of the right of society to; limit the rate of interest. We conclude, therefore, that gold and silver do not furnish a perfect medium of circulation; that they do not furnish facilities for the exchange of all commodities. Gold and silver have a value as money; a value which is artificial, and created unintentionally by the act of society establishing the precious metals as a legal tender. This new artificial value overrides all intrinsic actual values, and suffers no mediation between itself and them. Now, money, so far forth as it is mere money, ought to have no value; and the objection to the use of the precious metals as currency is that, as soon as they are adopted by society as a legal tender, there is superadded to their natural value this new, artificial and unnatural value. Gold and silver cannot facilitate the purchase of his new value which is added to themselves; \"a mediator is"} +{"output_text": "ning.\n\nThe GAO reports are a series of \"studies\" that are actually studies of studies. The first GAO report, GAO-01-923T, was published in 2001. It was followed by GAO-02-923T, GAO-03-923T, GAO-04-923T, GAO-05-923T, GAO-06-923T, GAO-07-923T", "input_text": " happy to draw attention to it. Without requiring any explanation, merely citing the trillion-dollar coin is enough to draw ridicule upon those who rationally support the underpinning possibilities of truly public money supplies.\n\nBut not all is lost. Through the trillion-dollar coin, the public has implicitly been exposed to the underpinning fact that there is something structurally different about minting coins, versus printing money. The foreclosed trillion-dollar coin debate can now be exploited to explain and explode an ongoing one-dollar \"coin-swap\" controversy, transforming the public's vague awareness of some key distinction between coins and notes into a concrete realization of the fundamental difference between United States currency and Federal Reserve currency. This message is heavily underscored by underhand anti-coin activism at the Federal Reserve. Why do you think the 1979 Susan B. Anthony dollar was so similar to a quarter that its production soon had to be halted? An ex-Mint chairman's grouse re more recent \"barriers\" to distribution is at http://financialservices.house.gov/uploadedfiles/hhrg-112-ba19-wstate-pdiehl-20121129.pdf.\n\nIn complete contrast to their perfunctory ridiculing of the trillion-dollar coin concept, the Federal Reserve and the Treasury are so afraid of what the public might learn from the one-dollar coin that for 22 years they have painstakingly mentored a series of 8 GAO reports that grossly understate the financial benefit that would automatically accrue to the government by replacing all one-dollar Federal Reserve notes with one-dollar United States coins. In the GAO reports -- or rather, in regularly attached Treasury and Federal Reserve letters of comments -- what began as a successful strategy of damning the dollar-coin by falsified faint praise, has become falsified direct dam"} +{"output_text": " to us, but they don\u2019t really care about the climate,\u201d she said.\n\n\u201cThey don\u2019t really care about the climate because they are not really doing anything about it. They are not really doing anything about it because they are not really listening to us. They are not really listening to us because they are not really listening to the science.\u201d\n\nShe says she is not sure if she will vote in the next U.S. presidential election, but she is certain she", "input_text": " by actor and ex-governor of California Arnold Schwarzenegger.\n\nBut what did she make of the impact of the U.N. summit, the weekly student strikes, the protests where millions packed the streets worldwide to demand action against anthropogenic climate change?\n\n\u201cIt depends,\u201d she says in her usual matter-of-fact manner in an interview on board \u201cLa Vagabonde,\u201d a sailboat owned by a young Australian couple that will be her home for the next two to three weeks.\n\nShe is wearing an oversized black windbreaker emblazoned with the words \u201cUnite Behind The Science\u201d as heavy, freezing rain pounds the hull.\n\n\u201cIn one way, lots of things have changed, and lots of things have moved in the right direction, but also in a sense we have gone a few more months without real action being taken and without people realizing the emergency we are in,\u201d said the high-schooler, who will return to her education next year.\n\nShe expresses her admiration for the people she met \u201cwho are living at the front line, and who are experiencing and living through the first consequences of the climate emergency\u201d \u2014 such as fellow teen Tokata Iron Eyes of the Standing Rock Sioux Nation, who fought in vain to stop the construction of an oil pipeline on her homeland.\n\nWhat did she learn from Obama? \u201cIt depends on how you define learning. I got an experience and he explained things to me, how it was to be in his position, how things work, and so on, so that, I guess.\u201d\n\nHer assessment of the presidents and prime ministers she encountered at the U.N., meanwhile, was less than stellar.\n\n\u201cWorld leaders and people in power, politicians ask me for selfies and ask other climate activists for selfies because they want to look good next"} +{"output_text": " the fact is that Trump has been a sexist, misogynist, and a bully for decades.\n\nAnd that's why it's so important to hold him accountable.\n\nTrump's sexism is not a new phenomenon. He's been a sexist for decades.\n\nTrump's sexism is not a new phenomenon. He's been a sexist for decades.\n\nIn the 1980s, Trump was a real estate developer who was sued by a former employee for", "input_text": " at an editorial board meeting.\n\nHe told biographer Timothy L. O'Brien that his favorite part of Pulp Fiction is the scene \"when Sam has his gun out in the diner and he tells the guy to tell his girlfriend to shut up. Tell that bitch to be cool. Say: \u2018Bitch, be cool.\u2019 I love those lines.\"\n\nis the scene \"when Sam has his gun out in the diner and he tells the guy to tell his girlfriend to shut up. Tell that bitch to be cool. Say: \u2018Bitch, be cool.\u2019 I love those lines.\" Of Arianna Huffington, he declared, \"I fully understand why her former husband left her for a man \u2014 he made a good decision.\"\n\n.@ariannahuff is unattractive both inside and out. I fully understand why her former husband left her for a man- he made a good decision. \u2014 Donald J. Trump (@realDonaldTrump) August 28, 2012\n\nAnd we haven't even gotten to Trump's comments during this presidential cycle. When Fox News's Megyn Kelly confronted him on his sexist comments, he laughed them off, saying to the applause of the audience that he only insulted Rosie O'Donnell, and declaring, \"I think the big problem this country has is being politically correct.\"\n\nAfter the debate, he turned on Kelly, telling CNN, \"She gets out and she starts asking me all sorts of ridiculous questions. You could see there was blood coming out of her eyes, blood coming out of her... wherever.\" He retweeted someone calling her a bimbo:\n\nAnd there are certainly plenty more instances of Trump misogyny the above list misses.\n\nYou can nitpick this or that item on the list. You can point to other parts of Trump's record to claim he sometimes support women. But"} +{"output_text": " that the Tories are the party of the working class, the reality is that the Tories are the party of the rich.\n\nThe Tories are the party of the rich.\n\nThe Tories are the party of the rich.\n\nThe Tories are the party of the rich.\n\nThe Tories are the party of the rich.\n\nThe Tories are the party of the rich.\n\nThe Tories are the party of the rich.\n\n", "input_text": ", the median age for American motorcyclists is 47. In 1990, it was 32. In January, the company\u2019s postings showed worldwide retail had fallen 6.7 per cent in 2016, with U.S. sales dropping 8.5 per cent.\n\nChristie, who rides a customized 1998 Harley-Davidson Ultra Classic, said he thinks the motorcycle maker is making a smart business move and will retain the affection of hard-core bikers.\n\nHe said he isn\u2019t impressed with how Trump managed to avoid the draft because of a medical condition, while many bikers enlisted.\n\n\u201cWhat did he have? Shin splints?\u201d Christie asked.\n\nCampbell also said he isn\u2019t impressed with Trump, despite his fond comments for bikers.\n\n\u201cI\u2019m not a politician so I can\u2019t really comment other than I think there\u2019s better people,\u201d Campbell said. \u201cHe has already stated that he doesn\u2019t read. How do you become knowledgeable if you don\u2019t read?\u201d\n\nChristie said he fears that Trump\u2019s tirades are costing the U.S. prestige internationally.\n\n\u201cWe set the standard for the world,\u201d Christie said. \u201cThat seems to have slipped away from us.\u201d\n\nWith files from Star wire services.\n\nRead more about: The Basic Income is as vital now as it was in the election. Anti austerity demos are planned, with a big one on 20th June. But without the Basic Income, the message is \u2018Put the Clock back\u2019. Sorry, the anti-scroungers have won that argument. Rachel Reeves, who would have seamlessly taken over from Iain Duncan Smith, Ed Miliband, and now Andy Burnham, Labour leadership contender, all agree that the workshy are a problem. So although the speeches will say"} +{"output_text": "as Fogg, the teacher set off on his epic journey from the Irish Sea to the Pacific Ocean, pedalling his way across the globe.\n\nThe teacher, who has been teaching for 30 years, said: \"I'm not a great cyclist but I'm a great teacher and I love my job.\n\n\"I'm not a great cyclist but I'm a great teacher and I love my job.\n\n\"I'm not a great cyclist but I'm a", "input_text": " Gogodermo experience, I'd have told them that they were liars and fools. 'But it's true! It's not called 'The Land of the Free' for nothing!\n\n\"You can more or less say what you want, think what you want and run the 100m in less than 10 seconds without having a rifle aimed at you or without having the long arm of the law on your case. I love everything about America! It\"s big, it's free and it's just wonderful!\"\n\nWhile Dermot has been out travelling across the world on hi 40,000km adventure, he has had lots of time to contemplate what he will do with the rest of his retirement, once he gets back home to Fingal, Dublin.\n\nWell the Rush man's future now seems to be set and Dermot now intends to open up a cafe and grocery shop with a difference in Skerries, this summer.\n\nHe said: \"The 'Gogodermo, Around the World Cafe and Grocery Shop' will open in Skerries this summer!'\n\nExplaining the concept of this environmentally friendly and healthy business, he said: \"If you're a green shopper, then you'll support Ireland's first 100% No Waste Cafe and Grocery Shop which will open in Skerries this summer.\"\n\nThe challenge of a lifetime began last July for Dermot as the retiring teacher left his classroom at Rush Lusk Educate Together National School and quite literally got on his bike to pedal the first few metres of a potentially record-breaking 40,000km cycle around the world.\n\nThere was a huge send off for the retiring teacher at the school as headed off on the greatest adventure of his life after closing the book on a long career in teaching.\n\nInspired by childhood tales of Phile"} +{"output_text": " expense of traditional campaigning. We had a strong ground game in place, with over 1,000 volunteers in the field. We also had a strong presence on social media, with over 1.5 million people following our campaign on Facebook.\n\nWe also had a strong presence on social media, with over 1.5 million people following our campaign on Facebook.\n\nWe were also able to use our digital campaigning to engage with voters in a way that traditional campaigning never could. We were able to", "input_text": " us to get off to a flying start.\n\nOn day one, we had our snap general election campaign pack ready to go. Soon after, support to local teams was delivered through targeted messaging at both national and constituency level. Alongside this, localised policy information, statistics and campaign materials supported activists not just with national political messages but with individual resources to fight in their areas, bringing their campaigns to life.\n\nInformed by the lessons of recent elections, ours was an innovative approach to campaigning. Behind the scenes the teams in HQ had been developing a range of new tools over the last year, and the snap election gave us the opportunity to deploy them for the first time. We placed digital at the heart of our activities, with HQ teams organised into separate campaign and organisational arms. The former led on delivering the right message to the right voters, the latter on engagement and mobilisation.\n\nCentral to this new approach was Promote, which linked our voter database technology with Facebook. This cutting-edge tool allowed us to deploy targeted messaging to key voters on a more personal and localised level than ever before.\n\nWe put unprecedented levels of funding into online advertising, supported by a highly professional data targeting operation that gave us an edge in getting the right messages in front of the right voters. This allowed us to make quick decisions about who and where to target.\n\nWhen we saw improved local canvass returns in Sheffield Hallam, we were able to target anti-Lib Dem Facebook messages at all the voters in the seats that we thought were being contested between Labour and the Lib Dems. In the last week alone 24 million people viewed our shared content on Facebook. And our fundraising reached new highs too. Over \u00a34m was raised in small donations, most of it from 220,000 online donations.\n\nWhile digital came to the fore, it was not at the"} +{"output_text": " preferable to the loop of bad customer service. \u201cAny justification that allows us to close that narrative leads to [fewer] rage-inducing interactions,\u201d Karmarkar says.\n\nThe Loop of Bad Customer Service Is a Real Thing\n\nThe loop of bad customer service is real, and it's not going away. \u201cIt's a real thing,\u201d Karmarkar says. \u201cIt's not going away.\u201d\n\nBut it's not a problem that's going to go", "input_text": " into an endless loop with a service rep clearly just reading from a script and not offering any real help as you rehash the problem, \u201cyou don't have a path forward, you don't have any way to respond or fix the problem,\u201d she says. \u201c[And] we don't like cliffhangers \u2026 there's a reason we move to edge of our chairs when music reaches a crescendo.\u201d\n\nWhile this can be seen even in a much more entertaining situation \u2014 Karmarkar cites a scene in \u201cWho Framed Roger Rabbit\u201d where it's impossible to resist ending the \u201cshave and a haircut...\u201d jingle \u2014 it's infuriating when we're left hanging with no power to change things.\n\nGet the better newsletter. This site is protected by recaptcha\n\nBad Customer Service Can Feel Like an Unresolved Loop\n\nThe nasty encounters generate all these negative emotions, Karmarkar says, and emotions have a duration. \u201cIf you can't handle them, process them or switch them,\u201d of course they're going to hang on. Since we can't just set them aside, \u201cI would imagine that is what can poison the rest of the day.\u201d\n\nAny justification that allows us to close that narrative leads to [fewer] rage-inducing interactions.\n\nThings can take an even more drastic turn when we move through the various channels of communication. Interacting with a real live person \u2014 face to face \u2014 is much easier to navigate, Karmarkar says, thanks to nonverbal cues like facial expression and body language. Much is lost on the phone, not to mention in online chat support \u2014 where now we have to wonder if we're even talking to a human or a robot.\n\nThe awful feeling that we're not being acknowledged is so bad, in fact, that even a negative resolution may be"} +{"output_text": " the \u201ccultural turn\u201d in Marxism.\n\nThe Frankfurt School\u2019s humanism was not a reaction against Stalinism, but rather a reaction against the humanism of the Second International, which was a reaction against the humanism of the First International. The Second International was the product of the First, which was the product of the French Revolution. The French Revolution was the product of the Enlightenment, which was the product of the Protestant Reformation. The Protestant Reformation was the product of", "input_text": " with. The Junker\u2019s existence is threatened by the burgher in a way that the proletarian\u2019s isn\u2019t. No surprise the prole would want to emulate his \u201cclass enemy,\u201d the burgher, especially since there were no hereditary restrictions to accomplishing this, and as it became easier with increasing consumerization and expansion of credit.\n\nThe Frankfurt School were not the key culprits behind Cultural Marxism, except insofar as they served as prime examples of the general humanist shift in Western Marxism after the Western disillusionment with Stalinism. This tendency, however, begins with the German Marxist Karl Korsch\u2019s calls for a return to Hegel and phenomenology in the 1920s, marking the start of Marxism becoming ever more an intellectual hobby of academics rather than an intellectual framework for revolutionary praxis. So too with Gyorgy Lukacs revisiting dialectics, and thus introducing Marxist-humanist preoccupations with \u201crationalization,\u201d (influenced by Weber) \u201cascribed consciousness\u201d and most famously \u201creification\u201d \u2014 an idea that would become one of the ultimate ancestors of the modern academic preoccupation with social constructionism. Later on we get Marcuse\u2019s more explicitly Freudian themes of \u201crepression\u201d and \u201csublimation.\u201d The concept of \u201calienation\u201d is shifted from the economic connotations of commodification of labor-power (an artifact of the division of labor more generally) into one of psychological malaise, the \u201cexternal objectification of human objectivity\u201d as Korsch dubbed it. Even Louis Althusser, who was supposedly the architect of a materialist reaction against Marxist humanism, ended up strengthening humanism through his concept of \u201cideological state apparatuses,\u201d becoming a corollary to Gramsci\u2019s \u201ccultural hegemony\u201d and hence an impetus for"} +{"output_text": " computer's growing awareness of its own mortality. The computer's awareness of its own mortality is the result of its own self-awareness, which is the result of its own self-reflection. The computer's self-reflection is the result of its own self-reflection, which is the result of its own self-reflection, which is the result of its own self-reflection, which is the result of its own self-reflection, which is the result of its", "input_text": " film or even known to the directors. Baudrillard's \"On Nihilism\" goes on to describe the destruction of meaning via postmodernism once meaning has been destroyed by appearances, but once both meaning and appearance has been destroyed, what is left? In the midst of a theoretically destructed and deconstructed society no images, signs, or sign systems are available for the act of construction that seems so inevitable to human thinking. The Wachowski brothers' appropriation of religious imagery to meet this need is telling. It is quite possible that The Matrix Trilogy not only points to the past and present future of science fiction, but to the past and present future of religion; it seems that their film series asserts that the dialectic of enlightenment governing the early 21st century is a dialectic engaging both instrumental reason and mystical religious experience. This question can only begin to be answered, however, via an analysis of the films in the light of Baudrillard.\n\n\n\nIn the pre-history of The Matrix Trilogy, which also finds exposition in the Animatrix film shorts, human computer technology developed to the point of creating an artificial intelligence; a thinking, willing, self-determined, conscious computer. This computer continued to learn and grow, \"spawning a whole race of machines\" (Matrix 1.7), gaining influence over human society incrementally to the point of almost total control. Human revolt took the form of an atomic cataclysm initiating a nuclear winter intended to block sunlight from the surface of the Earth and shut down the solar-powered computer. The plot, to this point, is unoriginal. The Terminator films operate on the same premise. It is the extension of the war into the minutia of human consciousness that generates an aura of mystical enlightenment over the film, adding to its widespread appeal. This extension of control takes place in response to the"} +{"output_text": "le Obama is a fan.\n\n\"I don't know if she's a fan. I don't know if she's a fan. I don't know if she's a fan. I don't know if she's a fan. I don't know if she's a fan. I don't know if she's a fan. I don't know if she's a fan. I don't know if she's a fan. I don't know if she's a fan. I", "input_text": " 2004. He had some excruciating headaches and suffered short-term memory loss.\n\n\"And,\" he adds, looking at me, \"I suffered short-term memory loss.\"\n\nYou may Google yourself from time to time, but George Clooney doesn't. How could he? It's different for him. It's overwhelming. Its infinite madness could disintegrate a man's personality. \"George Clooney\" pops up on nearly 11 million sites on the Internet. Spend a day browsing these sites and you will find unfathomable rage and baffling adoration. You will find America with all its insane colors refracted through the prism of George Clooney.\n\nBut George Clooney is also a brave man, and today he has agreed to spend a couple of hours exploring what the Internet has to say about George Clooney. A sort of This Is Your Virtual Life. Today he will see things that shock him, scare him, and make him shake with laughter. He will see things so disturbing that he will walk out of the room horrified. Also, he will see his own nipples.\n\nBut for now, a little after 9:00 a.m., as we zip down the FDR Drive to the loft where the Esquire photo crew is waiting, he's reading aloud his comparatively friendly user-generated biography:\n\nHe secretly financed and executive-produced a political thriller short film called The Endgame Study in 2006.\n\n\"Never heard of that. It was so secret, I have no idea what they're talking about.\"\n\nIt is rumored that Clooney was the one to have circulated the videotape of Jesus vs. Santa (the video greeting card that gave birth to South Park) around the Los Angeles area in 1995.\n\n\"There's truth to that.\"\n\nMichel"} +{"output_text": " some have used a combination of all three.\n\nThe evidence, however, is that the use of these tools has been limited. The main reason is that the crisis has been so severe that the authorities have been forced to use all of them. But the evidence also suggests that the use of capital controls has been limited, and that the use of foreign exchange intervention has been limited.\n\nThe evidence on capital controls is that they have been used in a limited way. The evidence on foreign exchange", "input_text": " however, against relying on exchange rate adjustment. The first is that, to the extent that domestic borrowers have borrowed in foreign currency, the depreciation has adverse effects on balance sheets, and leads to a decrease in domestic demand that may more than offset the increase in exports. The second is that much of the nominal depreciation may simply translate into higher inflation. The third is that large movements in the exchange rate may lead to disruptions, both in the real economy and in financial markets.\n\nThe evidence, however, is that the first two are much less relevant than they were in previous crises. Thanks to macroprudential measures, to the development of local currency bond markets, and to exchange rate flexibility and thus a better perception by borrowers of exchange rate risk, foreign exchange exposure in emerging market countries is much more limited than it was in previous crises. And thanks to increased credibility of monetary policy and inflation targets, inflation expectations appear much better anchored, leading to limited effects of exchange rate movements on inflation.\n\nHowever the third argument remains relevant. And this is why central banks in emerging market countries have not moved to full float, but to \u201cmanaged float,\u201d that is the joint use of the policy rate, foreign exchange intervention, macroprudential measures, and capital controls. This has allowed them to reduce the old dilemma that arises when the only instrument used is the policy rate: an increase in the policy rate may avoid the overheating associated with capital inflows, but at the same time, it may make it even more attractive for foreign investors to come in. Foreign exchange intervention, capital controls, and macro prudential tools can, at least in principle, limit movements in exchange rates, and disruptions in the financial system without recourse to the policy rate. Countries have used all of these tools in this crisis. Some have relied more on capital controls, some more on foreign exchange intervention. And"} +{"output_text": "aked and invisible for a short period of time\n\nEnemies will be cloaked and invisible for a short period of time Sticky Grenade: Enemies will be stuck to the ground and unable to move\n\nEnemies will be stuck to the ground and unable to move Flame Thrower: Enemies will be able to throw fireballs\n\nEnemies will be able to throw fireballs Flame Thrower: Enemies will be able to throw fire", "input_text": " adding more Weeklies to the Challenge lineup so you have more flexibility to progress on your schedule.\n\nFinally, we\u2019ve made several of last Season\u2019s rewards, like Lunchboxes, the Ammo Converter, available at Gold Bullion vendors so players who missed them have another opportunity to unlock them.\n\nDaily Ops\n\nPut your skills to the test by taking on instanced, randomized, and repeatable encounters called \u201cDaily Ops.\u201d Daily Ops are fairly challenging, and so we recommend them for characters who are level 50+.\n\nOur first Game Mode for Daily Ops is called \u201cUplink,\u201d which will require you to secure a series of Uplinks to track down and destroy enemies that are threats to Appalachia.\n\nTo join a Daily Op, open the Map screen and find the new World Activity tracker we\u2019ve added in the lower-left corner.\n\nEach day, Daily Ops will feature a fresh combination of location, enemy faction, and enemy mutations to keep you on your toes.\n\nLocations: The Burrows, The Burning Mine, Valley Galleria, or Vault 94\n\nThe Burrows, The Burning Mine, Valley Galleria, or Vault 94 Enemy Factions: Super Mutants, Blood Eagles, or Robots\n\nSuper Mutants, Blood Eagles, or Robots Enemy Mutations:\n\nPiercing Gaze: This mutation is always applied to enemies the \u201cUplink\u201d Daily Ops mode, and gives them greatly enhanced perception of players.\n\nThis mutation is always applied to enemies the \u201cUplink\u201d Daily Ops mode, and gives them greatly enhanced perception of players. Volatile: Enemies will explode on death\n\nEnemies will explode on death Active Camouflage: Enemies will be clo"} +{"output_text": "re Dame\n\nForward\n\nJackson is a bit of a wild card. He's a bit of a tweener, but he's a bit of a tweener who can shoot. He's a bit of a tweener who can shoot. He's a bit of a tweener who can shoot. He's a bit of a tweener who can shoot. He's a bit of a tweener who can shoot. He's a bit of a tween", "input_text": ".\n\nVideo: Prospect breakdown\n\nMalik Beasley\n\nFlorida State\n\nFreshman\n\nGuard\n\nThe Hornets, like so many teams in the league, are in need of shooters. Nicolas Batum and Courtney Lee are both free agents. Even if the Hornets re-sign one, landing Beasley would be a real steal for Charlotte.\n\nHis shooting ability and athleticism would make him a perfect fit here.\n\nVideo: Prospect breakdown\n\n23. Boston Celtics\n\nThon Maker\n\nAustralia\n\nAge: 19\n\nForward\n\nThis is the third first-round pick for the Celtics, and on the slim chance they keep all three, this gives Ainge a chance to swing for the fences.\n\nMaker could end up rising dramatically in the next few weeks with great workouts. He nailed his measurements, athletic testing and interviews at the combine. If he plays well in workouts, the Celtics might have to grab him at No. 16. If he struggles? He could end up in the second round.\n\nVideo: Prospect breakdown\n\n24. Philadelphia 76ers (via Heat)\n\nDejounte Murray\n\nWashington\n\nFreshman\n\nGuard\n\nDejounte Murray might have as much upside as any point guard in the draft after Jamal Murray and Dunn are off the board. He has great size, toughness and a knack for seeing the floor. He also can be wild, and his jump shot is erratic.\n\nMurray has bust potential, but he also carries some of the ingredients to be a star at the next level if he develops. That's exactly the type of player the Sixers need to find right now.\n\nVideo: Prospect breakdown\n\nDemetrius Jackson\n\nNot"} +{"output_text": ".\"\n\nSprint is also likely to use Airspan's 2.5 GHz LTE solution for macro coverage, according to analysts.\n\nSprint's NGN deployment is likely to be a \"hybrid\" solution, according to analysts. Sprint is likely to use Airspan's 2.5 GHz LTE solution for macro coverage, according to analysts.\n\n\"We believe that Sprint will use Airspan's 2.5 GHz LTE solution for macro coverage,\" said Chaplin. \"We", "input_text": " well as several thousand macro sites.\" They added that \"our checks have indicated that in one of the iterations of S's NGN plans, the company had proposed a 70K small cell buildout--which was supposedly won by Mobilitie.\"\n\nA Mobilitie spokeswoman did not respond to a request for comment.\n\nIt's unclear how widespread the small cell deployment will be across the country as well as how many macro cell sites will be involved. Sprint might also look to repurpose decommissioned Clearwire sites for CDMA and LTE service, according to analysts.\n\nAlthough Mobilitie might be deploying many of the small cells involved in Sprint's NGN, the carrier is likely going to be buying gear and small cells from at least three vendors, according to analysts: AirSpan, Nokia (NYSE:NOK) and Samsung. Nokia and Samsung are two of Sprint's vendors for its tri-band \"Spark\" LTE service and Nokia is a key supplier of 8T8R radios Sprint has been using for macro coverage. Nokia and Samsung representatives did not respond to quests for comment\n\nAirSpan, a privately held wireless company based in Boca Raton, Fla., which formerly developed WiMAX products, has developed an LTE product that uses 2.5 GHz for both radio access as well as backhaul. AirSpan declined to comment.\n\n\n\nBecause acquiring fiber for backhaul is prohibitively expensive, Sprint is likely going to use wireless backhaul to save money, analysts said. Both BTIG analyst Walter Piecyk and New Street Research analyst Jonathan Chaplin said that Sprint will likely use in-band wireless backhaul solutions using the lower 2.5 GHz spectrum band that would not require line of sight. Piecyk also wrote in a blog post last week that Sprint could \"allocate a chunk of the 2.5 GHz spectrum for backhaul"} +{"output_text": "est bloqu\u00e9 par Facebook.\n\n\u00ab Nous ne pouvons pas bloquer tout le monde \u00bb, a d\u00e9clar\u00e9 Facebook \u00e0 l\u2019AFP. \u00ab Nous avons des r\u00e8gles tr\u00e8s strictes pour \u00e9viter que des contenus illicites soient diffus\u00e9s sur Facebook. \u00bb\n\n\u00ab Nous avons des r\u00e8gles tr\u00e8s strictes pour \u00e9viter que des contenus illicites soient diffus\u00e9s sur Facebook. \u00bb\n\n\u00ab Nous ne pou", "input_text": " Les contenus \u00bb de Nordpresse \u00ab sont autoris\u00e9s sur Facebook \u00bb, a fait savoir le r\u00e9seau social dans un communiqu\u00e9 diffus\u00e9 dimanche en fin d\u2019apr\u00e8s-midi.\n\nEn revanche, \u00ab nous avons identifi\u00e9 un probl\u00e8me technique emp\u00eachant l\u2019affichage d\u2019un aper\u00e7u \u00bb lorsque les internautes cherchaient \u00e0 partager un lien sur le site de Nordpresse, a ajout\u00e9 le r\u00e9seau social. Ce probl\u00e8me \u00ab est en train d\u2019\u00eatre r\u00e9par\u00e9, et nous nous excusons pour la g\u00eane occasionn\u00e9e \u00bb.\n\nLes messages des internautes qui ont partag\u00e9 des articles du site retir\u00e9s par la suite indiquaient simplement que ces publications \u00e9taient \u00ab ind\u00e9sirables \u00bb au sens du r\u00e9seau social, sans autre pr\u00e9cision.\n\nPendant la nuit, Facebook fait le m\u00e9nage. https://t.co/yUyt8YVRMX \u2014 KatellFavennec (@Katell Favennec)\n\nIl n\u2019est donc pas possible de savoir en l\u2019\u00e9tat les raisons qui ont amen\u00e9 Facebook \u00e0 bloquer le partage des articles de Nordpresse.be. Ce manque de transparence dans la mise en \u0153uvre de telles d\u00e9cisions \u00e9ditoriales de la part de l\u2019entreprise est r\u00e9guli\u00e8rement critiqu\u00e9 en France comme ailleurs dans le monde.\n\nConcernant les accusations de \u00ab censure politique \u00bb \u00e0 l\u2019encontre du r\u00e9seau social, il faut n\u00e9anmoins pr\u00e9ciser qu\u2019elles se heurtent au fait qu\u2019aucun autre site, qu\u2019il soit satirique ou non, n\u2019"} +{"output_text": " the men could take before they fled the scene.\n\nThe next day, the Indianapolis Journal reported that the creature had been seen by a number of people in the area, including a local farmer who claimed to have seen the creature \u201cswimming\u201d in the nearby river. The farmer also claimed that the creature had been seen by a number of people in the area, including a local farmer who claimed to have seen the creature \u201cswimming\u201d in the nearby river.\n\nThe creature was also", "input_text": " but that they were likely responsible for the plethora of inexplicable animal mutilations as well as the scores of human beings who are reported missing every year. It goes without saying that the thought of huge, voracious, virtually undetectable predators that can descend from the sky in a flash to claim their unwary victims is not a comforting one.\n\nPerhaps it was Constable\u2019s alarming theory that inspired Japan\u2019s Toho Studios to produce \u201cDogora, the Space Monster.\u201d Dogora, a massive, floating jellyfish-like creature that hovered over Japan scooping up its terrified victims with long, whipping tendrils, was brought to life by renowned Godzilla collaborators director Ishir\u014d Honda and special-effects wizard Eiji Tsuburaya in 1964, and may well be the best cinematic articulation of an atmospheric monster ever created.\n\nWhile Dogora is well known to cult and kaiju film enthusiasts, arguably the most celebrated real life encounter involving an alleged atmospheric monster hailed from Crawfordsville, Indiana. According to the account published in the September 5th, 1891 edition of the Indianapolis Journal, at about 2 am. on September 4th, two men were repairing a wagon when they looked skyward and were shocked to see what they described as a \u201chorrible apparition\u201d soaring above them.\n\nThe men asserted that the multi-finned, rectangular, headless creature \u201cswam\u201d no less than 100-feet above them and they gauged its size to be approximately 8-feet wide and 20-feet in length. The men would later confirm to reporters that the beast was definitely animate.\n\nThe men watched in horror as the creature propelled itself through the heavens with its numerous fins and even circled above a nearby home. The monster then vanished as it traveled eastward, only to reappear moments later. This was about all"} +{"output_text": "\n\nThe other advantage you will have over traditional dispensaries is that you will be able to offer a wider variety of products. Traditional dispensaries are limited to offering flower, concentrates, and edibles. If you are able to offer a wider variety of products, you will be able to offer a wider variety of products to your customers.\n\nThe final advantage you will have over traditional dispensaries is that you will be able to offer a wider variety of products. Traditional dispensaries are limited", "input_text": ". At a minimum, you will likely need to create your own edibles, oils, and wax. This will require a significant investment into equipment and fixtures, as well as investment into your production facility to comply with fire code and other building requirements.\n\nAbsent investing in processing space and equipment, one way around this would be to enter a microbusiness collective, which would essentially consist two or more microbusinesses operating out of the same property. The advantage to this is that there could be significant strength in numbers. Similar to how bars generally do better when there are other bars located nearby, we believe that microbusinesses could potentially do better when paired with other complementary microbusinesses.\n\nFor example, let\u2019s assume there is a building with four different microbusinesses located within it. Two of those microbusinesses could sell flower\u2014maybe one sells a few of the most popular strains and the other sells more unique, tailored strains such as high-CBD flower. The other two businesses could then focus on processing, with one potentially focusing on edibles\u2014i.e. a green bakery, which could itself be successful as a standalone business\u2014and the other on oils and wax. If this were the case, the microbusiness collective could sell all of the products that a dispensary could sell, overcoming one of the weaknesses of the microbusiness dispensary model. The microbusinesses would also be able to save money by sharing certain costs\u2014e.g. advertising and marketing costs, certain personnel costs, and more.\n\nAssuming you are able to overcome these hurdles, you will be in a position to compete with traditional dispensaries. One advantage you will have over traditional dispensaries is that your microbusiness can market itself as locally produced \u201ccraft\u201d cannabis. All things being equal, most consumers prefer to shop local and support local businesses over larger conglomerates."} +{"output_text": "OzM0bScbWzBtX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f", "input_text": "SAbWzIxQxtbMTszNG0uLGNka2tPT09r bzsuDQobWzBtICAgX19fX19fXxtbOEMbWzE7MzRtLiwgG1swbV9fX19fX19fG1s1Q19fX19fX19f G1s2Q19fX19fX18NCiAgIFwgIF9fXy9fIF9fX18gG1sxOzM0bScbWzBtX19fXBtbNkMvX19fX19c G1s2Q19fX19fX19cXyAgIF8vXw0KICAgLyAgXy8gICBcXCAgIFwvICAgLyAgIF9fLxtbNUMvLyAg IHwgIFxfX19fXy8vG1s3Q1wNCiAgL19fX19fX19fXz4+G1s2QzwgX18vICAvICAgIC8tXCBfX19f IC8bWzVDXCBfX19fX19fLw0KIBtbMTFDPF9fXy9cX19fPiAgICAvX19fX19fX18vICAgIC9fX19f X19fPg0KIBtbNkMbWzE7MzRtLmRkYzssLDpjOy4bWzlDG1swbSxjOhtbOUMbWzM0bS5jeHhjOjs6 b3g6DQobWzM3bSAbWzZDG1sx"} +{"output_text": " been written about it.\n\nIf you see a brand that is not cruelty-free but is owned by a parent company that is cruelty-free, then it is noted.\n\nIf you see a brand that is not cruelty-free but is owned by a parent company that is not cruelty-free, then it is noted.\n\nIf you see a brand that is not cruelty-free but is owned by a parent company that is not cruelty-free, then it is noted.", "input_text": " Cruelty-Free Brand List or to be Logical Harmony Approved.\n\nThe Logical Harmony Cruelty-Free Brand list was last updated on September 11, 2020.\n\nWant to search this list?\n\nPress Ctrl + F on Windows or Command + F on a Mac to open a search box. Type in the brand name you wish to find and hit \u201cEnter\u201d. If the brand is listed, your browser should scroll to it automatically.\n\nStill can\u2019t find a brand on this list?\n\nThe brand you\u2019re looking for might be on the Brands to Avoid list, the Pending List, or the Grey Area list. (Click each list to view it.) If you have searched all the lists and cannot find the brand you are looking for then you can submit a brand here. Often times, brands aren\u2019t aware that being cruelty-free and being Logical Harmony approved matters to their customers. Let them know that it does by sending them an email, tweet, or leaving and IG comment.\n\nCruelty-Free Brands\n\nAll brands on this list are cruelty-free. This means there is no animal testing done at any point, by anyone. This goes from the ingredients all the way to the finished products out on the market.\n\nHow to read the list:\n\nBrands listed in bold offer only vegan products.\n\nIt is noted if a brand is cruelty-free but is owned by a parent company that is not cruelty-free.\n\nIf you see a brand in a font that isn\u2019t black, that\u2019s a link to content here on Logical Harmony where the brand is included or featured. That way you can see it in action, find out where it\u2019s sold, find out what products they make, or read any reviews that have"} +{"output_text": " competition. There are a number of broth-like beverages on the market, including bone broth-flavored soups, bone broth-flavored smoothies, and bone broth-flavored protein powders. There are also a number of bone broth-flavored foods, including bone broth-flavored crackers, bone broth-flavored cookies, and bone broth-flavored ice cream. There are even bone broth-flavored cocktails, like the", "input_text": ". \u201cIf some people want to say I\u2019m raping and pillaging and selling meat-flavored water, that\u2019s fine,\u201d the Italian chef told Grub Street in 2015. (Canora has, to his credit, always been straightforward about the product he is selling.) Business remains brisk.\n\nUnlike cronuts, rainbow-colored bagels, and other Instagram-optimized food fads, bone broth has proved resilient and versatile, and continues to gain steam as a much-hyped health beverage. Bone broth, also unlike many of its trendy counterparts, was not concocted by wily advertising-types specifically to photograph well on Instagram. It wasn\u2019t concocted at all, at least not any time in the recent past. It is simply slickly marketed broth, the same kind that people have been drinking for (at least) thousands of years. Many cultures claim a variation of broth as a dietary essential, from Korea\u2019s milky seolleongtang to British beef tea, which was trendy enough in 1865 to get a New York Times write-up. (\u201cBrodo,\u201d meanwhile, is Italian for \u201cbroth.\u201d)\n\nIn addition to feeling familiar to wide swaths of people, the commodification of broth dovetailed with the rise of the Paleo diet, which advises followers to adhere to an eating pattern heavy in vegetables, fruits, eggs, healthy oils, seeds, and meat, and to avoid processed foods, dairy, and legumes. There is no one set of dietary restrictions on the Paleo diet, but the emphasis is placed on consuming foods that were available to cavemen. Bone broth fits the bill perfectly, as it is just bones and water, and so many people in the Paleo community have enthusiastically claimed it as a beloved staple.\n\nBrodo now grapples with plenty of"} +{"output_text": " clear whether the House will pass a continuing resolution (CR) to fund the government at FY17 levels, or whether the House and Senate will pass a CR that funds the government at FY16 levels.\n\nThe Senate will begin consideration of the FY17 appropriations bills on Tuesday, November 17. The House will begin consideration of the FY17 appropriations bills on Wednesday, November 18.\n\nThe House and Senate will consider the FY17 appropriations bills on Thursday, November 19.\n\n", "input_text": " years ago, but earlier remains have been found in Monte Verde in Chile. So a first group may have come down the coast, and later groups from the same source population followed inland, carrying the same genetic heritage.\n\nPerhaps the most evocative mystery that remains is the identity of the boy himself. His is the only known Clovis grave. The tools he was buried with \u2013 including one that was already 150 years old and fashioned from an elk bone \u2013 would have been priceless heirlooms to those who carried them. Yet they left them in the ground with a child.\n\nWe may never know who the Anzick child was, but scientists and local US tribes have agreed to lay him back to rest (see \u201cTribal healing: Anzick child genome changed my life\u201c). He will be reburied sometime in the next few months.\n\nJournal reference: Nature, DOI: 10.1038/nature13025\n\nLeader: \u201cAncient genome won\u2019t heal rifts with Native Americans\u201c Here is our list of space policy events for the week of November 14-19, 2016 and any insight we can offer about them. The House and Senate are in session this week.\n\nDuring the Week\n\nThe House and Senate return to work for one week beginning tomorrow (Monday). The House meets for legislative business Monday-Thursday; the Senate will be in pro forma session on Monday and meet for legislative business the rest of the week. Then they will recess again until after Thanksgiving.\n\nWith Republicans retaining control of both chambers, there will be less organizational work to prepare for the 115th Congress that convenes in January. The one \u201cmust do\u201d item between now and the end of the year is passing appropriations bill(s) to fund the government past December 9. As we wrote yesterday, it\u2019s not"} +{"output_text": "\nUnity of Command is also a matter of leadership. The anarchist strategy of \u201cthe enemy of my enemy is my friend\u201d is a recipe for disaster. The enemy of my enemy is my enemy. The enemy of my enemy is my enemy. The enemy of my enemy is my enemy. The enemy of my enemy is my enemy. The enemy of my enemy is my enemy. The enemy of my enemy is my enemy. The enemy of my enemy is my enemy. The enemy of my enemy", "input_text": " first with the most,\u201d this is what he was talking about. We must engage those in power where we are strong and they are weak. We must strike when we have overwhelming force, and maneuver instead of engaging when we are outmatched. We have limited numbers and limited force, so we have to use that when and where it will be most effective.\n\nEconomy of Force. \u201cAllocate minimum essential combat power to secondary efforts.\u201d In order to achieve superiority of force in decisive operations, it\u2019s usually necessary to divert people and resources from less urgent or decisive operations. Economy of force requires that all personnel are performing important tasks, regardless of whether they are engaged in decisive operations or not.\n\nManeuver. \u201cPlace the enemy in a disadvantageous position through the flexible application of combat power.\u201d This hinges on mobility and flexibility, which are essential for asymmetric conflict. The fewer a group\u2019s numbers, the more mobile and agile it must be. This may mean concentrating forces, it may mean dispersing them, it may mean moving them, or it may mean hiding them. This is necessary to keep the enemy off balance and make that group\u2019s actions unpredictable.\n\nUnity of Command. \u201cFor every objective, ensure unity of effort under one responsible commander.\u201d This is where some streams of anarchist culture come up against millennia of strategic advice. We\u2019ve already discussed this under decision making and elsewhere, but it\u2019s worth repeating. No strategy can be implemented by consensus under dangerous or emergency circumstances. Participatory decision making is not compatible with high-risk or urgent operations. That\u2019s why the anarchist columns in the Spanish Civil War had officers even though they despised rulers. A group may arrive at a strategy by any decision-making method it desires, but when it comes to implementation, a hierarchy is required to undertake more serious action.\n"} +{"output_text": ", a former priest who is now a lobbyist for the Archdiocese of Los Angeles, to ask him to lobby for the bill. \u201cI\u2019m not sure that it would,\u201d McComas said. \u201cI\u2019m not sure that it would do anything.\u201d\n\nThe Catholic Church is not the only organization that has concerns about the bill. The Maryland Catholic Conference, which represents the state\u2019s Catholic bishops, has also expressed concerns.\n\nThe Maryland Catholic Conference is concerned that", "input_text": " \u201cPrivate, religious and non-profit organizations would face dramatically greater risks of potentially devastating civil claims,\u201d according to the testimony. Quoting California Gov. Jerry Brown (D) when he vetoed a bill that would have suspended the statute of limitations for civil suits in child sexual assault cases, the testimony continued: \u201cThere comes a time when an individual or an organization should be secure in the reasonable expectation that past acts are indeed in the past and not subject to further lawsuits.\u201d The opposition matters. \u201cMaryland is a Catholic state, you know,\u201d said Del. Susan K. McComas (R-Harford). House Judiciary Committee Chairman Joseph F. Vallario Jr. (D-Prince George\u2019s) has refused to put the bill up for a vote. He did not respond to my request for comment.\n\nIt is always nice to see that the Catholic Church has its concerns properly ordered. It is better to be concerned about \u201cpotentially devastating civil claims\u201d than the fact that hundreds of thousands of children faced abuse at the hands of various Church officials, most of whom will go unpunished. Granted, the Church is in a unique position because they can be and have previously been held liable for the abuse committed by their officials. Given the scale of the abuse and the extent of the cover-up, it is very likely that they could face numerous lawsuits even if the alleged perpetrators are dead.\n\nThat does not explain all the refusal to support the bills. It also does not explain this kind of exchange:\n\nBut [Susuan] McComas, who said she \u201cmay\u201d support the bill if it were to come up for a vote, is skeptical that it would do any good. \u201cMoney isn\u2019t going to cleanse any souls,\u201d McComas said. She said that was what she meant when she emailed John Plaschke, 50"} +{"output_text": " suspicious. The company had previously said that the mission would be streamed live, and that the company would be providing a \u201cfull-resolution\u201d video feed. The company\u2019s explanation was that the video feed was being restricted because of a \u201ctechnical issue.\u201d\n\nBut the explanation didn\u2019t hold up. The company\u2019s own website, which had been updated to reflect the change, still said the live stream would be available. And the company\u2019s Twitter account, which had been updated", "input_text": " of these could be reproduced (only 36% could). Dr Smaldino and Dr McElreath therefore modified their model to simulate the effects of replication, by randomly selecting experiments from the \u201cpublished\u201d literature to be repeated.\n\nA successful replication would boost the reputation of the lab that published the original result. Failure to replicate would result in a penalty. Worryingly, poor methods still won\u2014albeit more slowly. This was true in even the most punitive version of the model, in which labs received a penalty 100 times the value of the original \u201cpay-off\u201d for a result that failed to replicate, and replication rates were high (half of all results were subject to replication efforts).\n\nThe researchers\u2019 conclusion is therefore that when the ability to publish copiously in journals determines a lab\u2019s success, then \u201ctop-performing laboratories will always be those who are able to cut corners\u201d\u2014and that is regardless of the supposedly corrective process of replication.\n\nUltimately, therefore, the way to end the proliferation of bad science is not to nag people to behave better, or even to encourage replication, but for universities and funding agencies to stop rewarding researchers who publish copiously over those who publish fewer, but perhaps higher-quality papers. This, Dr Smaldino concedes, is easier said than done. Yet his model amply demonstrates the consequences for science of not doing so. Friday morning, SpaceX was prepping for what should have been an otherwise routine launch \u2014 sending 10 satellites into orbit for longtime customer Iridium \u2014 when the company made a strange announcement. During the live stream leading up to the mission, a SpaceX employee explained that the company would have to cut off footage from the Falcon 9 rocket once the vehicle reached orbit. And the host said restrictions from the National Oceanic and Atmospheric Administration were to blame.\n\nViewers were immediately"} +{"output_text": " and the UK and is now available on Netflix\n\nThe Church of Scientology, which is believed to have around 50,000 members, denied the allegations featured in Going Clear and said the statements included were 'entirely false'\n\nThe Church of Scientology, which is believed to have around 50,000 members, denied the allegations featured in Going Clear and said the statements included were 'entirely false.'\n\n'The film is a work of fiction, and the statements made", "input_text": ". He said once members reached a certain level they would see founder L. Ron Hubbard's handwritten account of the creation myth.\n\nThis states that Zenu, a galactic dictator froze people and dropped their bodies into volcanoes, 75 million years ago. These spirits are said to have jumped into the bodies of newborns and are now used as the explanation for the source of all of our anxieties and fears. Hubbard is also said to have created 'Ethics' - a series of punishments for auditors who made mistakes.\n\nMr Haggis left the church citing problems with its stance on gay rights, after his two gay daughters told him how they were being treated by members. The Academy Award winning filmmaker wrote an infamous resignation to Tommy Miscavage, Scientology's chairman and Hubbard's successor, saying he was disappointed that he had failed to denounce actions of a San Diego church against gay people.\n\nThe documentary, based on a book by journalist Lawrence Wright, also claims that when the church thought Cruise was \u2018slipping away\u2019 during his marriage to Nicole Kidman, it worked with him to wiretap her phone, a claim that is flatly denied.\n\nIn an interview with Business Insider Gibney said he took great steps to ensure that those who spoke out were not put into a compromising position and he ensured he never filmed them at their homes or arrived at a meeting point at the same time.\n\nHe added: 'I often used throw-away phones and encrypted e-mail. People were so frightened.'\n\nThe Church of Scientology, which is believed to have around 50,000 members, denied the allegations featured in Going Clear and said the statements included were 'entirely false.'\n\nAlex Gibney (left) directed the documentary and interviewed senior former members and Paul Haggis (right)\n\nThe documentary was screened in America"} +{"output_text": " was a close call, but the Vikings were a bit more interesting.\n\nThe Vikings are a team that has been in the playoffs for the last two seasons, and they have a quarterback who is a bit more than a year removed from a Super Bowl appearance. The Vikings are a team that has a reputation for being a bit more than a bit of a bully, and they have a coach who is a bit more than a bit of a bully. The Vikings are a team that has a reputation", "input_text": " is obvious to a reasonable person in the circumstances, that those means are suitable for putting, and are intended to put, the invention into effect in the [the country].\n\nSee, e.g., UK Patent Act \u00a7 60(2). (Most European countries have similar provisions because their statutes are all based on the 1975 Community Patent Convention.) Although the statute requires actual or constructive knowledge, the details of that knowledge requirement are unsettled: must the infringer have knowledge of the specific patent or merely knowledge of the acts that the direct infringer will take. If courts only require knowledge of the induced acts, liability would attach regardless of any specific knowledge of the patent.\n\nIn short, creators of 3D printable files, especially those with knowledge of a relevant patent, should be wary of making them available for others on the internet.\n\nLove Your Neighbor, or At Least Your Reputation\n\nAlthough the coronavirus pandemic inflames passions when needed medical equipment is in short supply, it is important to remember that in emergencies Article 31 of TRIPS, the key international patent treaty, provides flexibilities for governments to use \u2013 and authorize others to use \u2013 patents without the consent of patent holders.\n\nThe above analysis focuses only on the legal issues. Even if the patent holder is more interested in making money than saving lives, it may be wise to consider the reputational and other costs associated with denying live saving equipment to hospitals in need. Others, including patent holders relating to vaccine development, have initially threatened patent infringement suits only to backtrack after a storm of public outrage. So even if the patent holder made a threat to someone in Italy (and it is not at all clear that it did), it would be no surprise that it decided to change tactics. Giants-Redskins was the easiest selection last week \u2014 a stress-free, 24-3 Lock. Giants-Vikings"} +{"output_text": "stavit, jak by se mohlo st\u00e1t, kdybychom se v\u0161ichni v t\u00e9to situaci rozhodli, \u017ee nechceme, aby se stalo to, co se stalo.\n\nP\u0159edstavte si, \u017ee se v\u0161ichni v t\u00e9to situaci rozhodneme, \u017ee nechceme, aby se stalo to, co se stalo.\n\nP\u0159edst", "input_text": " sd\u00edl\u00edme s plazy. Sledovat, jak v\u00e1m n\u011bkdo u\u0161kodil a prost\u011b to nechat b\u00fdt \u2013 nechat sk\u00f3re naklon\u011bn\u00e9 v jejich prosp\u011bch \u2013 je nesnesiteln\u00e9, t\u00e9m\u011b\u0159 fyzicky to bol\u00ed. Tak\u017ee ano, i kdyby T\u00fdm 6 zavrtal bin L\u00e1dinovi kulku do \u010dela 13. z\u00e1\u0159\u00ed 2001, do v\u00e1lky bychom stejn\u011b \u0161li. Po\u0159\u00e1d bychom na tabulce vid\u011bli 3000 smrt\u00ed na na\u0161\u00ed stran\u011b a prost\u011b bychom to nemohli nechat b\u00fdt.\n\nTak\u017ee a\u017e p\u0159\u00ed\u0161t\u011b zapnete zpr\u00e1vy a zjist\u00edte, \u017ee terorist\u00e9 p\u0159ipravili explozi, kter\u00e1 zas\u00e1hla deset d\u011bt\u00ed, tohle je prvn\u00ed krok: Uv\u011bdomte si, \u017ee ukazatel sk\u00f3re v\u00e1m l\u017ee. Bude se v\u00e1m sna\u017eit namluvit, \u017ee jedin\u00fd zp\u016fsob, jak vyhr\u00e1t, je ihned za\u010d\u00edt bombardovat, i kdy\u017e sami dob\u0159e v\u00edte, \u017ee krom\u011b padouch\u016f to odnese i stovka dal\u0161\u00edch d\u011bt\u00ed. Sk\u00f3re \u2013 to skute\u010dn\u00e9 sk\u00f3re \u2013 by pak bylo:\n\nN\u00e1sil\u00ed na d\u011btech 110, Lidskost 0\n\nTe\u010f p\u0159ich\u00e1z\u00ed ta \u010d\u00e1st \u010dl\u00e1nku, ve kter\u00e9 se budu sna\u017eit p\u0159ed"} +{"output_text": " of the day, it is up to politicians to raise the minimum wage.\n\n\u201cIt is not up to the Premier League to decide how much money is going to be distributed to the clubs. It is up to the clubs to decide how they want to use that money.\n\n\u201cI can\u2019t make any guarantees about what the clubs will do with it. I can\u2019t make any guarantees about what the clubs will do with it. I can\u2019t make any guarantees about what", "input_text": "Our \u2018Why are we doing this?\u2019 turned out to be \u2018We do this for a living.\u2019 \u201d\n\n\u201cAnd because our audiences provide me with the love that\u2019s so lacking in my life otherwise,\u201d Conniff said.\n\n\u201cFrom a distance and in a group,\u201d Beaulieu clarified.\n\nWeinstein continued, \u201cThere\u2019s no reason to ever quit. I mean, it\u2019s already sad. Look at us\u2014aging, fat, balding.\u201d He threw up his hands cheerfully. \u201cSo what the hell.\u201d\n\nOn December 30th, after six years, Cinematic Titanic performed its final show; the group finally got sick of it. As Conniff said at the Best Buy Theatre, \u201cIt kills my spirit a little to know that I\u2019ve watched \u2018Santa Claus Conquers the Martians\u2019 more than I\u2019ve watched \u2018Citizen Kane.\u2019 \u201d \u2666 Premier League chief executive defends new \u00a35.14bn dealDismisses criticism of how windfall will be distributedScudamore: \u2018It is up to politicians to raise minimum wage\u2019\n\nRichard Scudamore has dismissed criticism of how the Premier League redistributes its income from broadcasters \u2013 insisting it is not clubs\u2019 responsibility to pay stadium staff the living wage.\n\nThe Premier League\u2019s chief executive, speaking after the announcement of a record \u00a35.14bn windfall from BT and Sky, said clubs would make individual decisions on how to use the money, and he could make no guarantees over the scale of redistribution or on reducing ticket prices.\n\nAsked on BBC Radio 4\u2019s Today programme whether clubs should react to the 70% rise in income by increasing the wages of their lowest-paid employees \u2013 with Chelsea the only Premier League club to commit to paying the living wage \u2013 Scudamore said: \u201cAt the end"} +{"output_text": "UNKNOWN\u2019S BATTLEGROUNDS has announced that they will be postponing their upcoming season until 2021. The announcement was made on the official PLAYERUNKNOWN\u2019S BATTLEGROUNDS website.\n\nUPDATED 6/9: The Overwatch League has announced that they will be postponing their season until 2021. The announcement was made on the official Overwatch League website.\n\nUPDATED 6/8: The Overwatch League has announced that they", "input_text": ", but announced on Monday, April 13 that the team expects to start playing OWL matches again in early May.\n\nUPDATED 3/27: Sixteen games of the 2020 Overwatch League were scheduled to take place online this weekend. However, both the New York Excelsior and the Vancouver Titans have stepped back from competing temporarily. The games have yet to be re-scheduled.\n\nUPDATED 3/11: All Overwatch League events have been canceled through April. However, OWL Commissioner, Pete Vlastelica, tweeted out that all games will still be played. Only the events themselves have been canceled. Please continue checking back for more information as the story updates.\n\nUPDATED 3/17: Overwatch League announced their new schedule, with games beginning March 21st. They also announced they'd be playing games entirely online, and across multiple different regions, eliminating the need for travel.\n\nUPDATED 3/24: In an effort to make up for scheduled games canceled by the COVID-19 outbreak, Overwatch League will run sixteen games this weekend. Eight games will take place on Saturday, March 28, and then eight more will take place throughout Sunday, March 29.\n\nOther Esports\n\nUPDATED 9/9: Counter-Strike Global Offensive Major ESL One Rio 2020 has been canceled due to COVID-19.\n\nUPDATED 6/19: Twitch announced it would cancel their TwitchCon San Diego event, set to take place this Fall due to continued concerns around COVID-19. TwitchCon Amsterdam was already canceled this March, meaning there will be no TwitchCons in 2020. Twitch is looking for other potential options to celebrate in an event that would still fit in health safety regulations. Stay tuned for more details.\n\nUPDATED 6/10: PLAYER"} +{"output_text": " each of those seasons.\n\nMayfield has thrown at least 67 percent of his passes in five of the Browns' six games this season. He has thrown at least 67 percent of his passes in four of the six games since Kitchens took over as offensive coordinator.\n\n\"I think it's a credit to Freddie and the offensive staff,\" Mayfield said. \"They're doing a great job of getting the ball out of my hands and getting me in a position to make plays.\"\n", "input_text": "Current firefighters swear they don't sneak into the firehouse at night to change the bulb.\n\n\"We're the Fire Department. We're very trustworthy,\" said firefighter Roy Andora. \"You trust us with your lives. You can trust us with a bulb.\"\n\nFans of the bulb have no doubt of the bulb's power. They watch it on a 24-hour webcam at www.centennialbulb.org. They know it's real.\n\n\"I equate it with a pet rock or a hula hoop,\" said Dick Jones, a retired Sandia engineer who specializes in photographing the bulb. \"It's just one of those things that people seem to like.\"\n\nLivermore is planning a party June 18 for the bulb, assuming no one accidentally smashes it before then. BEREA, Ohio -- Baker Mayfield has entered uncharted territory for Cleveland Browns quarterbacks.\n\nSince Gregg Williams replaced Hue Jackson as head coach and Freddie Kitchens replaced Toddy Haley as offensive coordinator, Mayfield has had five games in which he completed at least 67 percent of his passes. No other Browns quarterback has accomplished that feat since 1999, the year the Browns returned from a three-year hiatus, according to ProFootballReference.com. Mayfield is also one of only 42 NFL quarterbacks to do that since 1999.\n\nBest of NFL Nation \u2022 Jones vs. Kamara a big SNF matchup\n\n\u2022 Urschel goes from NFL to MIT\n\n\u2022 Belichick: Waller will be a challenge\n\n\u2022 Bolder Kyler Murray still humble\n\n\u2022 McCarthy might have Cowboys moment\n\nThe most recent Browns quarterback to do this was Milt Plum in the 1959 and 1960 seasons, according to ProFootballReference.com. But Plum threw 14 or fewer times in"} +{"output_text": "nant Church of Jesus Christ of Latter-day Saints gather for a devotional at the Denver Snuffer Center in Denver, Saturday, April 21, 2019.\n\nThe church\u2019s move toward greater transparency has been a reaction to the increasing number of people who have left the church. The church has been more open about its history and doctrine in the past 10 years than it has been in the past 100 years.\n\nThe church has been more open about its history and doctrine in the past", "input_text": " [edict on same-sex couples] as prophetic revelation, a painful and shocking policy became even harder to bear for LGBTQ Mormons and their loved ones. Even as the prophetic mantle passes to him, President Nelson may remain the symbol of ecclesiastical exclusion for LGBTQ Latter-day Saints far into the future.\n\nIt is my hope that, in the next 10 years, the engagement of leadership and the rank-and-file membership will result in forward motion as we work to make a place for everyone at church.\n\nPhoto courtesy of the LDS Church Joseph Smith's First Vision.\n\nEra of doubt\n\nEmily Jensen \u2022 In the 10 years since 2008, there are so many more members who have questions about church policies, doctrines and history and, while not willing to out themselves at church, they need outlets online to discuss if there is a place for them at church.\n\nI do think the church has become more open with its history and that will now never not be the case. I don\u2019t think the church could do another Joseph Smith [curriculum] manual, for instance, without it being compared to the Joseph Smith Papers.\n\nPatrick Mason, head of Mormon studies at Claremont Graduate University \u2022 In the long run, the church\u2019s move toward greater transparency will strengthen the institution by giving those members who remain greater maturity in their faith. The church and its teachings will be more confident and resilient, less defensive and brittle. In the short term, of course, that transparency has been both a reaction to and in some cases a cause of disaffection. No one knows the exact numbers, of course; I believe there are far more people who struggle with their faith but remain inside the church than those who leave.\n\n(Al Hartmann | The Salt Lake Tribune) Members of Denver Snuffer's Rem"} +{"output_text": " with Ayr United.\n\nThe cheapest adult and junior shirts are both from Ayr United, with the cheapest adult shirt costing \u00a339.99 and the cheapest junior shirt costing \u00a329.99.\n\nThe cheapest adult and junior shirts are both from Ayr United, with the cheapest adult shirt costing \u00a339.99 and the cheapest junior shirt costing \u00a329.99.\n\nThe cheapest adult and junior shirts are both from Ayr United, with the cheapest adult shirt costing \u00a339", "input_text": " tickets, resulting in 44 of 60 ticket price categories staying the same or coming down this year.\n\nHowever, there was a 12% rise in the average cheapest season ticket across the league - from \u00a3182.20 to \u00a3204 - with six sides offering a ticket costing \u00a3200 or more, compared to one last season.\n\nThis is explained by price rises at Arbroath (\u00a3180 to \u00a3200), East Fife (\u00a3162 to \u00a3220) and Queen's Park (\u00a3160 to \u00a3200), while relegated sides Ayr United (\u00a3220) and Raith Rovers (\u00a3230) are both above league average, although they have frozen and reduced their cheapest season ticket respectively.\n\nOther key findings in League One:\n\nAlbion Rovers and Forfar Athletic offer the cheapest season ticket at \u00a3180, Raith Rovers sell the most expensive at \u00a3270.\n\nForfar Athletic offer a cup of tea for 85p but a 5p rise means they are no longer the cheapest in the entire study - Oxford United Women offering a brew for 80p.\n\nAyr United sell the most expensive adult and junior shirts at \u00a349.99 and \u00a339.99 respectively - dearer than all Premiership sides except Aberdeen, Celtic and Rangers.\n\nLeague Two is the only Scottish division to see a fall in the average cheapest home matchday ticket - from \u00a312.30 to \u00a311.70 - while there is a 8.9% fall in the average price of a cheapest away ticket, from \u00a312.30 to \u00a311.20 - and 50 of 60 ticket price categories are frozen or down.\n\nEdinburgh City sell the cheapest pie in the entire Price of Football study, at just \u00a31.\n\nAt \u00a36, they also offer the joint-lowest single home ticket in men's football across the United Kingdom, together"} +{"output_text": " weeks later, but he is ineffective and is benched for the final two games of the season.\n\nOctober 23, 2011: The 49ers sign free agent wide receiver Michael Crabtree to a five-year, $42.5 million contract. Crabtree, who was the No. 2 overall pick in the 2009 draft, will be the team\u2019s top wide receiver in 2012.\n\nOctober 24, 2011: The 49ers sign free agent wide receiver Randy Moss to a", "input_text": "er David Akers to a contract. Akers will lead the league in field goals (44) during his first season at the helm.\n\nAugust 3, 2011: The 49ers sign cornerback Carlos Rogers to a one-year deal. Rogers will end up making the Pro Bowl in 2011 before signing a long-term deal with the team in 2012.\n\nAugust 4, 2011: Despite tweeting that he would sign with the Bengals, who even announced the signing on their team website, Donte Whitner signs a three-year contract with the 49ers.\n\nAugust 21, 2011: The Cardinals re-sign Fitzgerald to an eight-year, $120 million deal that guarantees their star wide receiver nearly $50 million.\n\nSeptember 11, 2011: During the opening week of the 2011 season, the Rams lose slot wideout Danny Amendola to a dislocated elbow and starting cornerback Ronald Bartell to a broken neck. Both players miss the remainder of the season; Bartell becomes the first of three starting cornerbacks to suffer season-ending injuries in St. Louis.\n\nOctober 12, 2011: Aaron Curry, 2009\u2019s fourth overall pick, is traded from Seattle to Oakland for a 2012 seventh-round pick and a conditional pick in the 2013 draft. The first pick in question is used on North Carolina State defensive end J.R. Sweezy, who converts to guard after being drafted by the Seahawks. Sweezy eventually becomes Seattle\u2019s starter at right guard. The second pick is used on cornerback Tharold Simon in the fifth round.\n\nOctober 16, 2011: Bradford suffers a high ankle sprain in a loss to the Packers. The injury adds to hip, toe, and finger issues that have slowed Bradford during St. Louis\u2019s 0-5 start to the season. The team brings Bradford back into the lineup three"} +{"output_text": "rected techs of RSI in 1944.12 scenario. - Added missing tech to list of known techs of RSI in 1944.12 scenario. - Added missing tech to list of known techs of RSI in 1944.12 scenario. - Added missing tech to list of known techs of RSI in 1944.12 scenario. - Added missing tech to list of known techs of RSI in 1944.12 scenario. - Added missing tech to list of known techs of R", "input_text": " handling of German leaders in SPA. - Added claims to England in events \"England surrenders\". - Changed \"Churchill becomes PM\" event pic (instead of generic). - Capital nuked - Aftermath events now check if the new capital has been not nuked already. - Corrected date and effect of POL event 2013005 (Wladyslaw Sikorski passes away) + added new event 2001096 (ENG Wladyslaw Sikorski and Tadeusz Klimecki die in Gibraltar B-24 crash). - Fading Sun event now takes into consideration a Korea already liberated by Chinese. - Reworked triggers and effects of OTT surrender events. - Waked all U08 slept leaders and ministers in event #2191583 (The military mission to the Ottoman Empire returns home). AI Changes Darkest Hour Full - Changed build priorities of BUL, HUN, ROM (air -7%, land +7%). - Japanese AI now properly garrisons Shanghai province before the war with China. - Removed Suez as invasion target in various AI files, added Port Said instead. - AI JAP should acquire military control over SIA and French Indochina to ensure that JAP AI will be the front leader in SE Asia (and won't release China-Nanjing before Chinese surrender). - Fixed invalid Alexandria province ID (790 instead of 789) in many ENG AI files. Scenarios Changes Darkest Hour Full - Corrected SOV Chief of Air Force in 1939-1945 scenarios. - Corrected land doctrines blueprints of Germany in 1933 scenario. - Added missing tech to list of known techs of RSI in 1944.12 scenario. - Adding ship assembly line to USA's starting techs in 1941 scenario. - Corrected unit names for Hungary in 1933-1945 scenarios. - Cor"} +{"output_text": " longer a problem.\n\nThe Guardian\u2019s experiment is not unique. The New York Times, which has a paywall, has also experimented with a \u201cfreemium\u201d model, in which readers can access some articles for free, but must pay for others. The Washington Post, which has a paywall, has experimented with a \u201cfreemium\u201d model, in which readers can access some articles for free, but must pay for others. The New York Times, which has", "input_text": " in 1473, his artistic outlays were to \u201cserve the glory of God, the honour of the city, and the commemoration of myself\u201d. Such men wanted to signal status, and they wanted to do penance (something which, if done conspicuously, served the former purpose too). No palazzo in Venice was complete without a portrait of the Virgin in each room.\n\nSome modern consumers feel driven by a secular version of the same motivation. Stian Westlake, one of the authors of \u201cCapitalism Without Capital\u201d, which looks at the consequences of the world\u2019s increasingly intangible economy, argues that people are decent intuitive economists; they know when something is unsustainable. And they know which public goods governments will and won\u2019t provide. \u201cMy take is that we\u2019re massively overpaid in tech, and I need to balance the books if the state won\u2019t,\u201d says one developer who supports eight people, mostly left-wing activists, on Patreon.\n\nGiven such enlightenment on a large-enough scale, patronage can fill the gap left by market failure and government reluctance. Possibly the clearest example of this is the Guardian, a British newspaper which refused to put up a paywall. In 2016, about 150m browsers visited its free website each month; subscribers numbered fewer than 200,000. Instead, since the summer of 2016, a little notice has appeared on the bottom of many stories: \u201cSince you\u2019re here,\u201d it begins, \u201cwe have a small favour to ask\u201d. It is a begging letter, asking Guardian readers for a few pounds per month. \u201cIt was a bit of a punt,\u201d says one person involved. A successful one: today 340,000 people donate monthly. Another 375,000 gave one-off donations in 2017. If people are willing to pay and let others benefit, free-riding is no"} +{"output_text": ", we believe that it is important to provide a comprehensive overview of the field of WebAL, and to identify the key areas of research that are currently being pursued. We also believe that it is important to identify the key challenges that the field faces, and to discuss the ways in which these challenges might be addressed in the future. In this review, we provide a brief overview of the field of WebAL, and then proceed to discuss the key areas of research that are currently being pursued. We then", "input_text": " in this exciting area.\n\n\u00a9 2016 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 3.0 Unported (CC BY) license.\n\n1 Introduction Section: Choose Top of page Abstract 1 Introduction << 2 Review Scope and Method... 3 The Precursors to WebAL 4 The 1990s: Early WebAL 5 The 2000s: WebAL Develo... 6 The 2010s: Current WebA... 7 Emerging Themes and Fut... 8 Conclusion Acknowledgments Notes References CITING ARTICLES In recent years there has been a growing body of work in artificial life (ALife) that makes use of web technology in one way or another. Over the last five years or so, web technologies have shifted away from proprietary browser plug-ins and towards standardized, native application programming interfaces (APIs) for providing graphics, animation, multimedia, and other advanced features. This progress is due to the development and adoption of the HTML5 language and associated API specifications introduced by the World Wide Web Consortium (W3C).1 This movement has made it much easier to develop and deploy rich web-based applications that work reliably and consistently on any browser, across multiple platforms and devices. It is therefore unsurprising that a number of high-profile ALife projects have emerged over this period that utilize web technology in various different ways. We refer to such work as WebAL, and broadly construe the field to include the multitude of ways in which ALife and the web might intersect. Examples include the creation of massively distributed user-guided evolutionary systems, the creation of open science platforms, the use of web-based applications for public outreach and engagement, and the use of crowdfunding platforms for supporting the development of ALife systems. As we demonstrate in this review and summarize in Section 7, there are many other points of intersection in addition to these. In light of this"} +{"output_text": " might think.\n\n\u201cThe gap between Democrats and Republicans is not as large as people might think,\u201d says Krosnick. \u201cThe majority of Democrats are in favor of climate action, and the majority of Republicans are not.\u201d\n\nThe poll also found that Americans are more likely to support climate action if they believe it will help the economy.\n\n\u201cThe majority of Americans who believe climate change will hurt the economy are not willing to pay more for energy,\u201d says Krosnick. \u201c", "input_text": ".\n\nRosenwasser pointed to an article Buffington used as basis for his opinion, which said the drug most frequently produces euphoria and empathy.\n\n\"I'm not saying those don't happen. I'm saying that there's no predictability that that's going to be the outcome you achieve,\" Buffington said. Share this\n\nArticle Facebook\n\nTwitter\n\nEmail You are free to share this article under the Attribution 4.0 International license. University Stanford University\n\nWhile the United States is deeply divided on many issues, there is remarkable consensus on climate change, according to new research.\n\n\u201cBut the American people are vastly underestimating how green the country wants to be,\u201d says Jon Krosnick, a professor of communication and of political science at Stanford University, about new findings from a poll he led on American attitudes about climate change.\n\n\u201cThe majority doesn\u2019t realize how many people agree with them\u2026\u201d\n\nResearchers conducted the study with ABC News and Resources for the Future, a Washington, DC-based research organization. They polled a representative sample of 1,000 American adults nationwide from May 7 to June 11, 2018. The margin of error is +/- 3.5 percentage points.\n\nThe poll showed that Americans don\u2019t realize how much they agree about global warming: Despite 74 percent of Americans believing the world\u2019s temperature has been rising, respondents wrongly guessed 57 percent.\n\n\u201cThe majority doesn\u2019t realize how many people agree with them,\u201d says Krosnick. \u201cAnd this may have important implications for politics: If people knew how prevalent green views are in the country, they might be more inclined to demand more government action on the issue.\u201d\n\nBreaking the numbers down along party lines, although Republicans and Democrats differ on the issue, the poll revealed that the gap is not as large as people"} +{"output_text": ".\"\n\n\"I'm not uneducated,\" he says. \"I'm educated. I'm a coal miner.\"\n\nThe film's most moving moments come when the Dreamers are interviewed.\n\n\"I'm not a criminal,\" says one, a young man who came to the U.S. from Mexico as a child. \"I'm a human being.\"\n\n\"I'm not a criminal,\" says another, a young woman who came to the U.S.", "input_text": " talk economics, Clinton backers talk gender and accomplishments, and those standing outside the system deliver some of the film's most cogent arguments. They include Vermont organic farmer Boots Wardinski, a Liberty Union Party candidate for lieutenant governor who notes that being truly committed to strong principles is the antidote to getting elected. The diehard environmentalist received 2 percent of the state's vote, and viewed the presidential race as immaterial.\n\nAnother outsider, a self-described \"houseless\" Honolulu man, living in a tent, can't vote because of his prison record. His ignorance of the presidential campaigns and wall-to-wall media coverage looks like a kind of bliss, heightened by the island setting.\n\nBut issues are well past the point of arguing \u2014 at least in terms of voter turnout \u2014 during the hours captured by the filmmakers. Journalists' reluctant discernment of the ballot-box upset is its own subject. At Philadelphia public radio station WHYY and in the newsroom of the Los Angeles Times, journalists who'd drunk the same pollster Kool-Aid as most Americans prepare to report on the election of the country's first female president. They meet the surprising results with no blatant emotion, calmly changing gear \u2014 and headlines.\n\nElsewhere the film tracks shock, elation, despair. In San Jose, community organizer Jesus, himself a Dreamer whose citizenship is on the line under a Trump administration, tries to soothe neighbors' deportation fears. A pall settles over New York City's Javits Center, where Clinton supporters have gathered, expecting to celebrate.\n\nThe film is sensitive to the us-vs.-them factor. Watching news coverage of the election results, West Virginia miner Eric Hayhurst bristles at news anchors' use of the phrase \"rural areas,\" interpreting it as code for \"uneducated"} +{"output_text": " deep crosser by the X receiver.\n\nThe corner route is a simple concept. The tight end is going to run a post route, and the corner route is going to be the first read.\n\nThe corner route is a simple concept. The tight end is going to run a post route, and the corner route is going to be the first read. The corner route is a simple concept. The tight end is going to run a post route, and the corner route is going to", "input_text": " a great example of this. The Z receiver goes deep and the two interior receivers work to get open underneath.\n\nA lot of coaches are split on the topic of tendencies, and staying predictable. Still, everyone agrees that you need the ability to keep your opponent off-balance, even if your scheme doesn't feature a ton of bells and whistles.\n\nSo how do you get the football to your widest receiver in the formation, the guy furthest away to the strong side?\n\nAll kinds of ways, really, but in this post we're gonna be looking at one play call I thought was very creative.\n\nThe Play\n\nThe Packers receivers have a very good understanding of where they are on the field, how they fit in with the other pieces in the offense, and when the football is supposed to come out of the quarterback's hands.\n\nThis concept is designed with all that in mind, as well as the particular style of defense they're expecting to see.\n\nWhen you're playing a team who specializes in zone coverage, or maybe you're just facing a long yardage situation, it's not enough anymore to simply drop back, invite the pass rush up the field, and then dump it off to the tailback behind a wall of blockers.\n\nDefenses have gotten smarter, and even the ones who haven't have still gotten faster. It doesn't matter how well you fool a guy with that perfectly-timed screen pass if he has the speed to chase you down from behind.\n\nSo what's to be done? Well, you've gotta find other ways to get the football to your speedsters with blockers in front, and Green Bay has done just that.\n\nLook at the diagram below.\n\nYou've got a corner route by the tight end, a very shallow slant by the Z receiver, and the"} +{"output_text": " and the terrain. It is a maximum security prison, where the inmates are kept in solitary confinement for 23 hours a day.\n\nThe inmates are not allowed to have books, newspapers, magazines, or any other reading material. They are not allowed to have a radio or a television. They are not allowed to have a telephone. They are not allowed to have a pen or pencil. They are not allowed to have a toothbrush. They are not allowed to have a razor. They are not", "input_text": " do?\u201d\n\n\u201cThey\u2019re all scared,\u201d said Marie, beginning to yawn. \u201cAnyway, that\u2019s communism or something.\u201d\n\n\u201cOK. So what? Like in prison. A guy would get to talking like I am and some guy would yell: \u2018Communist!\u2019 and it would shut him up. But that don\u2019t scare me. Call it what you like. It\u2019s still good sense.\u201d\n\nA film made (or novel written) today about the New York state manhunt would likely adopt the point of view of the \u201cheroes\u201d who shot Matt three times in the head or the \u201ccourageous\u201d state trooper who fired twice at an unarmed, fleeing Sweat and struck him in the back.\n\n\u201cThe nightmare is finally over.\u201d\n\nThe daring, complicated escape by Matt and Sweat generated a massive manhunt by 1,300 officers from a dozen police agencies. A $75,000 bounty was put on each man\u2019s head by the government.\n\nThe escape also set off a media frenzy, hysterical, bloodthirsty and vindictive. To read the newspapers, including the \u201cliberal\u201d New York Times and the gutter right-wing New York Post, or listen to the television news, one would have thought that Matt and Sweat were master criminals.\n\nIn our estimation, the media in their coverage of the manhunt, egging on the dogs, helicopters and heavily armed officers, exhibited far more brutality than the poor wretches on the run. There was no need to kill Matt or Sweat, except to make an example of them, and to satisfy the sadism and ruthlessness of the authorities.\n\nDannemora (as the institution is known) is a brutal prison, referred to as Little Siberia because of the climate"} +{"output_text": "\n\nHacks kernel & writes a horde of scripts to prevent folk from ever using more than their fair share of system resources. ADMINISTRATIVE FASCIST: Hacks kernel & writes a horde of scripts to prevent folk from ever using more than their fair share of system resources.\n\nHacks kernel & writes a horde of scripts to prevent folk from ever using more than their fair share of system resources. MANIAC: Richard Stallman.\n\n", "input_text": " who say \u201cvirtual reality.\u201d Drinks damn near anything, but favors Wild Turkey, Black Bush, and grain alcohol. Hangs out in neighborhood bars and tries to pick up MOTAS by drinking longshoremen under the table.\n\nReads Utne Reader and Mein Kampf. Faithfully attends Dickies and Ramones concerts. Punches out people who say \u201cvirtual reality.\u201d Drinks damn near anything, but favors Wild Turkey, Black Bush, and grain alcohol. Hangs out in neighborhood bars and tries to pick up MOTAS by drinking longshoremen under the table. IDIOT: Reads Time and Newsweek\u2014and believes them. Drinks Jagermeister. Tries to pick up close blood relations\u2014often succeeds, producing next generation of idiots.\n\n1992 Presidential Election:\n\nTECHNICAL THUG: Clinton, but only because he liked Gore's book.\n\nClinton, but only because he liked Gore's book. ADMINISTRATIVE FASCIST: Bush. Possibly Clinton, but only because he liked Tipper.\n\nBush. Possibly Clinton, but only because he liked Tipper. MANIAC: Frank Zappa.\n\nFrank Zappa. IDIOT: Perot.\n\n1996 Presidential Election:\n\nTECHNICAL THUG: Richard Stallman \u2014 Larry Wall.\n\nRichard Stallman \u2014 Larry Wall. ADMINISTRATIVE FASCIST: Nixon \u2014 Buchanan.\n\nNixon \u2014 Buchanan. MANIAC: Frank Zappa.\n\nFrank Zappa. IDIOT: Quayle.\n\nCompound System Administrators:\n\nTECHNICAL FASCIST: Hacks kernel & writes a horde of scripts to prevent folk from ever using more than their fair share of system resources."} +{"output_text": " skepticism about the plan, which would allow the state to import drugs from Canada, a country with a lower cost of production than the United States.\n\nAzar said the plan would be \u201ca very big deal\u201d for the U.S. and that he would \u201cwant to see the details.\u201d\n\nThe plan is a big deal because it would allow Florida to import drugs from Canada, which would lower the cost of prescription drugs for Floridians.\n\nThe plan is a big", "input_text": " to 30 years, I think we\u2019ll get kids saying, \u201cMummy, Daddy, I can\u2019t believe you used to eat animals. Why!?\u201d It\u2019s only a matter of time. In 20 years, I think you\u2019ll get countries declaring themselves vegetarian.\n\n20% of 16- to 25-year-olds are now vegetarian, compared with 12% of the country. This generation really gets it.\n\nThat is so hard to imagine.\n\nMaybe, but only in the past century has society really accepted that black people deserve all the same rights white people do, and that women deserve the rights men do. One philosopher [Arthur Schopenhauer] said that every truth passes through three stages. First, it is ridiculed. Then, it\u2019s violently opposed. Then it\u2019s accepted as truth. Now we\u2019re at the ridicule stage. People still make jokes about vegans. Next step, I imagine you\u2019ll get some violent opposition.\n\nSo how do kids who want to go vegetarian or vegan respond to that ridicule?\n\nLaugh it off, man. Lions don\u2019t concern themselves with the opinions of sheep. This is about caring about the environment. This is the empathetic side. If you\u2019re against empathy, then you\u2019re on the wrong side. President Trump Donald John TrumpBiden on Trump's refusal to commit to peaceful transfer of power: 'What country are we in?' Romney: 'Unthinkable and unacceptable' to not commit to peaceful transition of power Two Louisville police officers shot amid Breonna Taylor grand jury protests MORE is breaking with fellow Republicans by opening the door to approving Florida\u2019s proposal to import prescription drugs from Canada, a potentially major shake-up of how Americans pay for drugs.\n\nTrump\u2019s own secretary of Health and Human Services, Alex Azar, expressed"} +{"output_text": " -(void)showPicker;\n\nFinally, add the following method to the class continuation above the @implementation line:\n\n-(void)viewDidLoad { [super viewDidLoad]; // Do any additional setup after loading the view from its nib. //[self hidePicker]; //[self showPicker]; }\n\nNow, in the viewDidLoad method, add the following code to hide the date picker:\n\n[self hidePicker];\n\nAnd add the following", "input_text": "h, add the following import statements below the existing #import line:\n\n#import \"FailedBankInfo.h\" #import \"FailedBankDetails.h\"\n\nThen add the following properties and methods before the final @end:\n\n@property (nonatomic, strong) FailedBankInfo *bankInfo; @property (nonatomic, weak) IBOutlet UITextField *nameField; @property (nonatomic, weak) IBOutlet UITextField *cityField; @property (nonatomic, weak) IBOutlet UITextField *zipField; @property (nonatomic, weak) IBOutlet UITextField *stateField; @property (nonatomic, weak) IBOutlet UILabel *tagsLabel; @property (nonatomic, weak) IBOutlet UILabel *dateLabel; @property (nonatomic, weak) IBOutlet UIDatePicker *datePicker; -(id)initWithBankInfo:(FailedBankInfo *) info;\n\nNext, in SMBankDetailViewController.xib, hook up each outlet you defined with its corresponding component. You can do this by selecting the File\u2019s Owner (which is the SMBankDetailViewController class), switching to the Connections Inspector in the right sidebar, and dragging from each outlet to the relevant control on the view.\n\nAt the top of SMBankDetailViewController.m, right below the @implementation line, synthesize all the properties as follows:\n\n@synthesize bankInfo = _bankInfo; @synthesize nameField; @synthesize cityField; @synthesize zipField; @synthesize stateField; @synthesize tagsLabel; @synthesize dateLabel; @synthesize datePicker;\n\nAlso add two private methods to the class continuation above the @implementation line, as follows:\n\n@interface SMBankDetailViewController () -(void)hidePicker;"} +{"output_text": ", calling it a \u201cdangerous\u201d step that would \u201cundermine the peace process.\u201d\n\n\u201cThe US decision to move its embassy to Jerusalem is a dangerous step that undermines the peace process and the two-state solution,\u201d Iranian Foreign Ministry Spokesman Bahram Qassemi said on Saturday.\n\n\u201cThe US decision to move its embassy to Jerusalem is a dangerous step that undermines the peace process and the two-state solution,\u201d Iranian Foreign Ministry Spokesman Bahram", "input_text": "\u2013all for the sake of fitting in. It got so bad that one night I actually had to walk home from the bars in the middle of the night (about a 1-2 mile walk) in high heels and a dress because I couldn\u2019t afford the taxi ride home.\n\nMy only excuse for this whole debacle is that I was living in a big city 3 hours from my hometown and had virtually no friends. I was desperate to find people to hang out with, and I was desperate for those people not to find out that I was barely paying my bills. They were either older and/or more well off than I was, and I simply could not keep up with their lifestyle. (See, what I learned from being broke)\n\nEventually, I had a falling out with one of the group members and the stress of keeping up the ruse became too much for me. I slowly phased myself out of the group. At the time, I was pretty bummed about it because I had to go back to having barely any friends. But looking back, I was certainly better off. I\u2019m not sure why I wanted to hang out with a bunch of 35-year-olds who were still spending all their time in college bars, anyway.\n\nConclusion\n\nClearly, I\u2019ve done a lot of dumb things that have cost me money. I am absolutely certain that this list doesn\u2019t even cover half of the way I so waste much money, but it\u2019s a good start for now.\n\nThe good thing is, once you realize you are doing dumb things, you can stop doing them. A little self-awareness goes a long way with your checkbook.\n\nThis article was originally published on Your Money Geek. Iran has strongly condemned US President Donald Trump\u2019s decision to move his country\u2019s embassy in Israel to Jerusalem"} +{"output_text": " Rakhine State are understandable.\n\nThe Rohingya insurgency is not a new thing. It has been going on intermittently since 1947.\n\nThe Rohingya insurgency is not a new thing. It has been going on intermittently since 1947. The Rohingya insurgency is not a new thing. It has been going on intermittently since 1947. The Rohingya insurgency is not a new thing. It has been going on intermittently since 1947. The", "input_text": "In May 1946 Rohingya leaders met with Mohammed Ali Jinnah, the Muslim leader who founded modern Pakistan, and asked that part of Rakhine state be annexed by East Pakistan. Then, when Jinnah refused to interfere in Burmese matters, they founded the Mujahid Party in in northern Arakan in 1947. The aim of the Mujahid party was initially to create an autonomous Muslim state in Arakan. The local mujahideen \u2013 that\u2019s what the Rohingya warriors proudly called themselves \u2014 fought government forces in an attempt to have the mostly Rohingya-populated Mayu peninsula in northern Rakhine State secede from Myanmar (then Burma), and after that secession, the Rohingyas hoped that territory would be annexed by East Pakistan (present-day Bangladesh). Fighting between the Rohingya and the Burmese state, then, is not a new thing; it has been going on intermittently since 1947. The Rohingya revolt eventually lost momentum in the late 1950s and early 1960s, and many of the Rohingya surrendered to government forces.\n\nBut the Muslim insurrection by the Rohingya did not disappear. It was revived in the 1970s, which in turn led to the Burmese government mounting, in 1978, a huge military operation (Operation King Dragon) that inflicted great damage on the mujahideen, and bought a decade of relative calm. But again the Rohingya rose up against the Burmese state, and in the 1990s the \u201cRohingya Solidarity Organisation\u201d attacked Burmese authorities near the border with Bangladesh. In other words, this insurgency by the Muslim Rohingya has been going on \u2013 waxing and waning \u2013 for more than half a century. It is in that context that Buddhist fears of a Muslim takeover of"} +{"output_text": " friendly manner. Don\u2019t be rude. Be polite. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be humble. Be", "input_text": "iscuss their passions and hobbies. Theodore Roosevelt was a master in this area. He knew a broad range of subjects. Whenever he had to meet an extraordinary person, he\u2019d study his interests. This habit allowed him to woo people.\n\nThe Sixth Rule\n\nFind a suitable method to makes people feel valued. For example, ask yourself what traits in others you admire. William James, a psychologist, says, \u201cthe deepest desire of human beings is to get appreciation.\u201d\n\nBy praising others, you help add to their sense of importance. But, you must be genuine in your gratitude. Compliments must not appear to be fake flattery.\n\n\u201cHow to Win People to Your Way of Thinking\u201d\n\nAdopt 12 strategies to convince people to trust what you\u2019re telling them. Try to use every technique in your conversations consciously:\n\nThe only way to win an argument is to avoid it \u2013 Disagreements tend to make people defensive. Besides, someone who thinks he/she has lost an argument loses face. Once you become part of a debate, you can never win. It\u2019s because if you lose, then you lose. And, even if you win, you lose. Hence, avoid arguments. Show respect for other people\u2019s opinions \u2013 Don\u2019t make people feel you disagree by using careless words or looks. When you challenge their views, you compel them to attack. This doesn\u2019t change their mind at all. Admit when you are wrong \u2013 If you err, quickly accept it. Such admission is useful when you know others think you\u2019re wrong. It\u2019s better to hear self-criticism. Besides, when you accept a mistake, others are likely to be more forgiving. But, if you don\u2019t accept, they\u2019ll be more critical. Even if you are angry, begin in a"} +{"output_text": " scholarship. It cites a \u201cHizballah-affiliated\u201d mosque in Queens as a \u201cpossible Hizballah front\u201d and a \u201cHamas-affiliated\u201d mosque in Brooklyn as a \u201cpossible Hamas front.\u201d It also cites a \u201cpossible Hizballah front\u201d in the Bronx, and a \u201cpossible Hamas front\u201d in the Bronx. It\u2019s not clear how the NYPD arrived at these conclusions, but it\u2019s hard to imagine that the department\u2019s analysts", "input_text": " employed undercover cops, informants and, significantly, \u201cunprecedented help\u201d from the CIA, which is technically barred from spying on Americans. A taste of the NYPD\u2019s activities:\n\nIn just two episodes showing how widely the NYPD cast its net, the department sought a rundown from the taxi commission of every Pakistani cab driver in the city, and produced an analytical report on every mosque within 100 miles, officials said.\n\nSome Muslims, particularly cab drivers, complain of being subjected to \u201cvoluntary interrogations\u201d by police, encounters that, while legal, prove intimidating and alienating.\n\nThe latest report disclosed by the AP, dubbed \u201cU.S.-Iran Conflict: The Threat to New York City,\u201d was authored by the NYPD half a decade ago. But, given the current climate of tensions and diplomatic pressure surrounding the fraught U.S.-Iranian relationship, the concerns that it expresses seem as relevant now as they were then. It airs suspicions \u2014 albeit largely unfounded ones \u2014 over the influence of Tehran-backed militant groups like Hizballah and Hamas in pockets of New York\u2019s Shi\u2019ite community. Unlike most other countries in the Middle East, Iran is majority Shi\u2019ite, and its theocratic government styles itself at the revolutionary vanguard of a form of messianic Shi\u2019ism. But that doesn\u2019t justify such a full-scale sweep of Shi\u2019ite mosques and neighborhood centers, the vast majority of which were frequented by those not of Iranian descent. Though the NYPD strenuously denies carrying out such operations in the city solely on the basis of a community\u2019s religious denomination, it seems hard to arrive at any other conclusion.\n\n(Photos: 30 Mosques in 30 Days \u2014 An American Trip)\n\nStill, the report betrays an almost amusing level of shoddy"} +{"output_text": "us Project (Link to join Nxtchat en Slack)\n\nJanus Project (Link to join Nxtchat en Slack)\n\nJanus Project (Link to join Nxtchat en Slack)\n\nJanus Project (Link to join Nxtchat en Slack)\n\nJanus Project (Link to join Nxtchat en Slack)\n\nJanus Project (Link to join Nxtchat en Slack)\n\nJanus Project (Link", "input_text": " automated process based on the success of their calls, and potential followers will be able to see these scores and current followers recommendations prior to signing up for each signal callers private rooms. Our team has some experience in the development of forex trading bots. We plan to create an in browser gui to configure various trading strategies and put traders in the position to backtest these using a multithreaded approach which will dramatically shorten the time required to backtest trading strategies when compared to packages such as MT4, etc. This will be marked as part of a professional package within the social fintech site and carry a higher price but should prove extremely valuable for end users. There are more options that will be added over time until our business becomes the primary source for all interested parties to convene. We will of course run heavy marketing and gain user feedback, then apply suggestions when appropriate to further make the entire user experience amazing for all our users. We have achieved great milestones so far, yet the future for us as a team and company holds this and much more. We invite you to our multiplatform, and diverse industry tokens, community, and team. \u2013 Team Janus\n\nIn addition to this, Janus has also announced on Twitter that Bitsquare, the first decentralized exchange to buy and sell bitcoins, in addition to other cryptocurrencies, has added JNS / BTC trading pair. It\u2019s important to note that in addition to Bitsquare, users have the ability to trade JNS assets on the Nxt Asset Exchange with the NXT / JNS pair. Users can also trade JNS on Stocks.Exchange, with trading pairs for BTC and NXT enabled as well as with Bitcoin or fiat currencies on C-CEX..\n\nMore Information:\n\n#janusproject (Link to join Nxtchat en Slack)\n\nNews compilation about Jan"} +{"output_text": "on camera): This is the most important base in South Korea. It's the home of the U.S. Air Force's Fifth Air Force. It's the home of the U.S. Air Force's Eighth Air Force. It's the home of the U.S. Air Force's Air Combat Command.\n\nRADDATZ (voice-over): The base is home to the U.S. Air Force's F-22 Raptor, the most advanced fighter jet in", "input_text": " Korean leader Kim Jong-Un. Green camo on huge transporters.\n\nIf he, indeed, has a missile big enough to reach the United States, and if he can make a warhead small enough to fit on it, will the U.S. be forced the respond?\n\nThis hour, all the angles on the most important story in the world right now. And the urgent questions: How far along is North Korea's nuclear program? Are the new missiles we saw at Saturday's parade real? Does Donald Trump have a firm red line? Is there room for negotiation or are Trump and Kim on a collision course?\n\nVice President Mike Pence arrived here in Seoul today, the start of a ten-day Asia tour. And with China warning both sides to cool it, North Korea is at the top of the administration's agenda. In a moment, we'll talk to Trump National Security Adviser H.R. McMaster, a powerful voice in the president's inner circle. He's in Afghanistan, where the United States just proved what it can do with the \"mother of all bomb\" strikes on ISIS, a dramatic show of American firepower.\n\nBut we begin with North Korea's warning that it will annihilate military bases here in South Korea. They said in minutes if the U.S. tries to take out its nuclear program.\n\nWe visited the most important of those front line bases, Osan Air Base, just south of Seoul, 48 miles from the border with North Korea. We got exclusive and unprecedented access.\n\n(BEGINVIDEOTAPE)\n\nRADDATZ (voice-over): If North Korea pulls the trigger this could be target number one -- Osan Air Base, just 48 miles from North Korea, well within range of Kim Jong-Un's existing arsenal of missiles carrying god knows what.\n\n("} +{"output_text": " be deemed to be a threat to U.S. interests.\n\nThe U.S. Treasury has been trying to get foreign central banks to move their gold to London, but the London Gold Pool has been reluctant to do so. The U.S. Treasury has been trying to get foreign central banks to move their gold to the Bank of England, but the Bank of England has been reluctant to do so.\n\nThe U.S. Treasury has been trying to get foreign central banks", "input_text": "\nOf all areas of global power politics today, international finance and foreign investment have become the key flashpoint. International monetary reserves were supposed to be the most sacrosanct, and international debt enforcement closely associated.\n\nCentral banks have long held their gold and other monetary reserves in the United States and London. Back in 1945 this seemed reasonable, because the New York Federal Reserve Bank (in whose basement foreign central bank gold was kept) was militarily safe, and because the London Gold Pool was the vehicle by which the U.S. Treasury kept the dollar \u201cas good as gold\u201d at $35 an ounce. Foreign reserves over and above gold were kept in the form of U.S. Treasury securities, to be bought and sold on the New York and London foreign-exchange markets to stabilize exchange rates. Most foreign loans to governments were denominated in U.S. dollars, so Wall Street banks were normally name as paying agents.\n\nThat was the case with Iran under the Shah, whom the United States had installed after sponsoring the 1953 coup against Mohammed Mosaddegh when he sought to nationalize Anglo-Iranian Oil (now British Petroleum) or at least tax it. After the Shah was overthrown, the Khomeini regime asked its paying agent, the Chase Manhattan bank, to use its deposits to pay its bondholders. At the direction of the U.S. Government Chase refused to do so. U.S. courts then declared Iran to be in default, and froze all its assets in the United States and anywhere else they were able.\n\nThis showed that international finance was an arm of the U.S. State Department and Pentagon. But that was a generation ago, and only recently did foreign countries begin to feel queasy about leaving their gold holdings in the United States, where they might be grabbed at will to punish any country that might"} +{"output_text": " read our Privacy Policy here. Sign up for Take Action Now and get three actions in your inbox every week.\n\nThank you for signing up. For more from The Nation, check out our latest issue\n\nSubscribe now for as little as $2 a month!\n\nSupport Progressive Journalism The Nation is reader supported: Chip in $10 or more to help us continue to write about the issues that matter. The Nation is reader supported: Chip in $10 or more to help us continue", "input_text": " his staff, and it was decided that the Warriors would do less, not more. The big picture always seems to win.\n\n\u201cWe did a defensive segment that we shortened to six minutes, instead of 12,\u201d Kerr says. \u201cAnd then we did some skill work, conceptual work that\u2019s not going to tax them but give them a good groove and work up a sweat.\n\n\u201cIt\u2019s just a feel thing. It helps that I played.\u201d\n\nWhat the Warriors are doing isn\u2019t exactly revolutionary. The concept is not that different from the approach taken by the NFL Seattle Seahawks and their coach Pete Carroll, someone from whom Kerr has borrowed a few theories.\n\nThere is an acknowledgment of the needs of the players, on and off the court, something Kerr picked up from two of his coaches, Phil Jackson in Chicago and Gregg Popovich in San Antonio, two of the top five coaches in NBA history.\n\nKerr believes in strategically resting players in certain games and going easier on the veterans, occasionally excusing them from even the usual light drills.\n\n\u201cFrom a player\u2019s standpoint, especially an older player, you look at is as being able to preserve and prolong your career,\u201d Livingston says. \u201cIt\u2019s amazing. It\u2019s the best job in the world, but we only get so long to do it. There\u2019s a window. But if you can increase that window, what more could you ask for as a player?\n\n\u201cThis situation provides a unique opportunity to do that. Steve likes to make sure guys are fresh for the games and the playoffs, so he doesn\u2019t run us into the ground.\u201d Ready to fight back? Sign up for Take Action Now and get three actions in your inbox every week. You will receive occasional promotional offers for programs that support The Nation\u2019s journalism. You can"} +{"output_text": " were blocked. The driver took him to the wrong one, and the volunteer had to pay for the cab ride back.\n\nThe seafarers' center at Felixstowe, Britain's largest port, carries union jack and London-themed souvenirs for sailors who can't spare five hours to get to the real capital, a two-hour train journey away.\n\nThe seafarers' center at Felixstowe, Britain's largest port, carries union jack and London-", "input_text": " a godsend, because most are in or near ports, and seafarers lack the time or money to go farther. The missions usually provide Internet access, a bar, and souvenirs for time-strapped sailors. The popular seafarers' center at Felixstowe, Britain's largest port, carries union jack and London-themed souvenirs for sailors who can't spare five hours to get to the real capital, a two-hour train journey away. It also sells chocolate: Italians and Croatians love Cadbury; Indians prefer Quality Street.\n\nWhen there is no mission, entrepreneurs sometimes step in. At Rotterdam, where the seafarers' center is 25 miles and a $140 taxi fare away, transport is provided to a nearby duty-free store. At first, I thought the shop was called Bootleg, which was satisfyingly nautical, but in fact it is Botlek, after an area of this huge port. Other wharves are named after rivers: You can be berthed at the Yangtze, Mississippi, or Amazon. Transport to Botlek is free if customers spend $140, but given the prices\u2014twice that of supermarkets, making the duty that is free irrelevant\u2014that shouldn't be hard.\n\nSeafarers are used to being exploited. At sea, the captain moans at chandlers who supply ships with green bananas that will never ripen; at fruit that goes moldy obscenely fast; at sub-standard meat. He swears that he used to see meat stamped with \"For Merchant Navy only.\" Afloat and ashore, seafarers, with constricted time, little local knowledge, and ready cash, are easy marks. A volunteer at the Liverpool mission tells me of a taxi driver who claimed that both tunnels under the River Mersey"} +{"output_text": " mouse button was pressed. This was a regression from GIMP 2.8.x.\n\nThe Free Select tool now creates a preliminary selection immediately when the mouse button is pressed.\n\nThe Free Select tool now creates a preliminary selection immediately when the mouse button is pressed.\n\nThe Free Select tool now creates a preliminary selection immediately when the mouse button is pressed.\n\nThe Free Select tool now creates a preliminary selection immediately when the mouse button is pressed.\n\nThe Free Select tool", "input_text": " of a new Windows 10 feature. Windows 10 users indeed now have the ability to install fonts without admin permissions since a recent update.\n\nTherefore we added this non-admin font folder to our search path when running GIMP. It should be only a temporary workaround since eventually this should be supported by fontconfig, the upstream library used to manage fonts.\n\nNote also that it is not fully tested because of our lack of Windows developers. Therefore we are mostly hoping it will work as expected, and this is a good time to make a call again:\n\nAre you a Windows developer? Do you love GIMP? Please contribute!\n\nSeriously, none of our current developers use Windows and bugs are piling up in our bug tracker for this platform (same can be said on macOS by the way), whereas GIMP is so enjoyably stable on GNU/Linux. We are happy to do the occasional good deeds, but there are limits to what we can do for a platform we don\u2019t use. On the other hands, we happily welcome patches and new contributors!\n\nFaster painting\u00b6\n\nGIMP now doesn\u2019t replace the paint buffer on every dab if the paint color/pixmap hasn\u2019t changed. This results in faster painting on specific cases.\n\nAs a by-product of the change, the color-from-gradient dynamics is now fixed when the image has a color profile.\n\nIncremental mode in the Dodge/Burn tool\u00b6\n\nThe Dodge/Burn tool got a new \u201cIncremental\u201d option which, similarly to the Paintbrush, Pencil, and Eraser tools, applies the effect incrementally as the pointer moves.\n\nFree Select tool creates preliminary selection\u00b6\n\nOne of GIMP 2.10.0 changes which annoyed many people was that the Free Select tool was not creating a selection immediately when the"} +{"output_text": "\n\n\u201cI\u2019m not sure I\u2019m comfortable with this,\u201d I said. \u201cI\u2019m not sure I\u2019m comfortable with you teaching me how to fight.\u201d\n\n\u201cYou\u2019re not comfortable with me teaching you how to fight?\u201d Lieutenant Davis asked. \u201cI\u2019m not comfortable with you teaching me how to fight. I\u2019m not comfortable with you teaching me how to fight. I\u2019m not comfortable with you teaching me how to fight. I\u2019m not", "input_text": " Then she stepped forward and turned around to face us. \u201cWhen Riko gets here with some experienced spear-users, we\u2019re going to spar as a group. The lieutenant and I are the only ones here with training about how to handle ourselves in a fight with another person, so we\u2019re going to each lead a team. You two will consult with the two of us and give us ideas. Riko will officiate. Just like we were planning to do in the tent, over a map, but with live people.\n\n\u201cI\u2026 What? You want us to help teach you how to fight?\u201d I couldn\u2019t quite grasp the idea. I could imagine, in some way, that someone had to teach law enforcement officers how to subdue people, and obviously Ma had been taught to fight long ago, but I certainly didn\u2019t know enough about fighting to teach anyone.\n\nLieutenant Davis\u2019s voice cut through my confusion. \u201cNo. Fobi and I already know how to fight people. The experienced spear hunters know how to fight animals. Your job is to give us ideas about how to fight many people all at once. There are law enforcement tactics for dealing with large groups of disorderly people, but that\u2019s all about containment and arrest, not about defending and attacking. In fifteen years on the force, I\u2019ve only had to use that training twice, and the tactics we used worked, so there hasn\u2019t been much need for me or anyone else to think about it much.\u201d\n\nI turned away from Fobi to face the lieutenant, and saw him removing a padded end from a blunt spear before putting it back on and tying it back in place. He then set the spear on the ground to his left side, and picked up another from his right side, quickly untying the padded end and repeating the process."} +{"output_text": " them):\n\nsudo bash\n\n26. Type in the following commands exactly as shown below (or copy paste them):\n\nsudo apt-get install tor\n\nsudo apt-get install tor-geoipdb\n\nsudo apt-get install tor-geoipdb-extra\n\nsudo apt-get install tor-geoipdb-extra-dbg\n\nsudo apt-get install tor-geoipdb-pcre\n\nsudo apt-get install tor-geo", "input_text": "1\n\n18. Set the port for \"SOCKS Host\" to 9050\n\n19. Make sure that \"SOCKS v5\" is selected at the bottom.\n\n20. Click \"Ok\" and then \"Close\"\n\nNow we have instructed Firefox to use TOR. However, Firefox cannot use TOR yet. Right now, Ubuntu is completely unable to connect to the Internet. We\n\nare going to change that.\n\n21. Go to \"Applications\" -> \"Accessories\" -> \"Terminal\"\n\n22. Type in: sudo bash (and hit enter)\n\n23. Type in your password if prompted.\n\n24. Type in the following commands exactly as shown below (or copy paste them):\n\nifconfig eth0 10.0.3.2\n\n/etc/init.d/polipo stop\n\n/etc/init.d/tor stop\n\n/etc/init.d/privoxy stop\n\n(Note: the last three commands, those beginning with /etc/ are only necessary if you installed Torchat)\n\nNow you have told your Ubuntu machine to join the same network that your Firewall is on. Now we can establish a tunnel for TOR data to flow from our\n\nUbuntu machine, through the Firewall, into your Windows guest machine. We need to establish two such tunnels.\n\nThe first tunnel for port 9050 data, and the second tunnel for port 8118 data. When these two tunnels are set up, it will be possible for you to use\n\nyour Ubuntu machine to access any website using TOR. Further, it is still completely impossible for your Ubuntu machine to access the Internet in any\n\nother way.\n\n25. Your terminal window should still be open. Type in the following command exactly as shown (or copy paste"} +{"output_text": " of them.\n\nWhen you are in a life or death situation, you need to be able to control the subject\u2019s arm. This is not always the case. If you are in a life or death situation and you are unable to control the subject\u2019s arm, you will be in a life or death situation.\n\nWhen you are in a life or death situation, you need to be able to control the subject\u2019s arm. This is not always the case. If you", "input_text": " for those of us focusing on coming to grips with our opponents on the mat.\u201d\n\nStephan Kesting\n\nMost of the time, the hands are the first thing to come in contact with someone else. Chances are, if you are in law enforcement or corrections, your hands are used every day. The strength of you hands, wrists, and forearms can either ensure a successful restraint or put you in severe danger if they are your weak link. This article will approach grip from a different view point. Instead of a training routine on how to close grippers, pinch plates, or lift the Inch Dumbbell, I will attempt to explain functional grip training for life or death situations.\n\nLaw Enforcement/Corrections\n\nApprehending, restraining, and detaining a subject. This is the core of your profession. You have to \u201clay hands\u201d on someone almost every day. In the field of corrections, chances are it IS everyday. Most of the time the subject is going to go quietly. You will have no problems and after 8-10 hours, you can punch out, and head home. Unfortunately, this is not always the case.\n\nWhen someone makes the decision to fight back, strong hands are going to make all the difference. The ability to grasp and control a subject\u2019s arm will give you the upper hand and protect you from hand strikes and/or striking or stabbing weapons. If you lack the ability to hang onto a fighting subject, you will put yourself and those around you in danger.\n\nWhen faced with a subject who has made the decision to fight, you are often limited on what to do. Of course, the main objective is to get the subject onto the ground and restrained as quickly as possible. Hand, wrist, and forearm strength will enable you to grab and control the subject, regardless of how you get a hold"} +{"output_text": " smart contracts. We need a language that\u2019s easy to use and understand.\n\nThirtyK: What are the biggest challenges you\u2019re facing?\n\nIvanovs: We\u2019re still in the early stages of development. We\u2019re still building the infrastructure. We\u2019re still building the product. We\u2019re still building the team. We\u2019re still building the community.\n\nThirtyK: What are the biggest opportunities you\u2019re seeing?\n\n", "input_text": "\nThirtyK: In FIC\u2019s white paper, you say a key issue with cryptocurrency is that it has no opportunities for fixed income investment. Why is that a problem?\n\nIvanovs: Every developed economy needs healthy credit markets, the ability to lend with interest to entrepreneurs who want to build something.\n\nAs crypto markets mature, a lot of companies will be willing to raise capital if it\u2019s stable enough. On the other side, there\u2019s this vast number of individuals and organizations who hold a lot of cryptocurrency in wallets. If your crypto is in your wallet, it doesn\u2019t [accelerate] the overall economy, just like holding dollars in your wallet doesn\u2019t either.\n\nWe\u2019re trying to build an infrastructure for people to invest their crypto into interest-generating products \u2014 corporate bonds, asset-backed securities, and loans. Capital is spread across the world. It\u2019s hard to reach Japanese investors if you have a company in Brazil. There\u2019s a vast number of investors who have accumulated wealth but don\u2019t have the means to invest it, and a network of companies who would be able to raise money to expand their operations.\n\n1,500 Transactions Per Second\n\nThirtyK: How do you compare yourselves to other players in the space, including established financial institutions?\n\nIvanovs: We are building something from the ground up for the capital markets. We\u2019re one of the first ones to tackle this issue, and we have the upper hand in the way we have structured our governance around the blockchain.\n\nThirtyK: What about your governance structure is suited for capital markets?\n\nIvanovs: It feels like everyone is working on ethereum (ETH) and its complicated smart contracts language. I\u2019ve been vocal that capital markets don\u2019t need"} +{"output_text": " would be interesting if he was messing with the nature of time. So I started thinking about that. And then I started thinking about the nature of time. And then I started thinking about the nature of space. And then I started thinking about the nature of the universe. And then I started thinking about the nature of the multiverse. And then I started thinking about the nature of the multiverse. And then I started thinking about the nature of the multiverse. And then I started thinking about", "input_text": " long I have a team that's got my back. A really, really good team. Obviously Vijaya has been my partner since before my first issue. Vijaya and I actually came up with the plan together for Cartoon Books, to do a black and white comic book every two months and each one will be a chapter in this larger novel. Vijaya knew the ending of the story before we even started. I've got Kathleen, who I mentioned earlier, Kathleen Glosan, who's helped Vijaya run the office. She's the contact with most of the outside world. She sets up publicity, takes care of things, makes sure what I'm supposed to be doing. I've got Bone but about any art-related job that needs to be done at Cartoon Books that isn't actual comic book pages. I still have to do those by myself. And we have\n\n\n\nSPURGEON: Tell me about the creative part of moving into a final chapter on a book like RASL. How much of it is figuring out the book in addition to deciding where you want to take it? How much is learning where the book wants to take you?\n\n\n\nSMITH: I start out thinking I know what the ending is and where I'm going. And I do. Mostly. The ending will be the same ending. But as you write, especially a serialized book, which is what comic books do really well, I think, the story grows. You get ideas as you're going along. In RASL, there's that spooky little girl that doesn't talk. She was not in my original plan. She just kind of popped onto the page one time. I was suggesting that Rasl, by going to different universes and traveling back and forth, was altering something. He was messing with nature. It popped into my head that it"} +{"output_text": " LSSP/CP, they have not been involved in the armed struggle. They are not a homogeneous group. They have their own political and cultural identity. They are not a homogeneous group. They have their own political and cultural identity.\n\nFourth, there are the so-called \u201cTamil-speaking Muslims\u201d who are not a homogeneous group either. They are a combination of native and Indian Tamils. They are not a homogeneous group either. They have their own political and cultural identity", "input_text": " fair play from them.\n\nLater, however,the LSSP/CP leaders made a fundamental mistake and joined the capitalist government of Mrs Sirima Bandaranaike. Once in power they also became pawns in the hands of Sinhala chauvinism. In fact Dr Colvin De Silva, who once said \u201cOne language - two nations\u201d became the man who formulated the Buddhist theocratic constitution. This total betrayal led to much disillusionment among Tamil youth. Failure of the old left movement against Sinhala Buddhist chauvinism opened the stage for violence against the Tamil people. Racial riots broke out several times, the worst of which was in 1983. All this led to the liberation struggle of the Tamil people.\n\nToday, the Lankan Tamil nationality is a complex entity. First, the so-called native Tamils of the north and east have developed as Eelam Tamils with a recognisable homeland. Their national consciousness has developed to a high degree among them. The existing armed struggle for Tamil liberation is based on them. Under the memorandum of understanding signed by Ranil Wickremesinghe in 2002, more than one-third of the Tamil homeland was accepted to be under the direct control of the LTTE.\n\nSecond, the plantation Tamils of the up-country, descendents of south Indian Tamil workers brought to Lanka by the British Raj, have not identified entirely with the Eelam liberation struggle. Certainly the youth is highly influenced by the struggle in the north and east. But their demands are different, being very largely socioeconomic. Land and citizenship rights, greater autonomy for Tamil areas, and greater Tamil participation in local administration are some of their demands.\n\nThird, there is a substantial Tamil community living in Colombo and its suburbs. They are a combination of native and Indian Tamils. Except for the"} +{"output_text": " Horcrux. He was the first to be created by Voldemort, and the first to be destroyed. The final Horcrux was the Elder Wand, which was the most powerful of all the Horcruxes, and the only one that Voldemort could not destroy.\n\nThe Elder Wand is a powerful object, and one that Voldemort was not able to destroy. It is a symbol of power, and a symbol of immortality. It", "input_text": " shares an equal blame in not only imparting this information, but elaborating on the concept and framing Riddle\u2019s interest as evidence of his unique mind. The fact that Slughorn buried and hid the memory so deeply that Dumbledore had to resort to using Harry to draw the truth forth, alludes to the fact that Slughorn too recognised that he had started Riddle on the path to becoming Voldemort. However, the desire for immortality was already in place from an early age and tied to his mother\u2019s death. A young Riddle argued to Dumbledore that his mother could not have been a witch, or else she would not have died.\n\nIn total Voldemort creates eight Horcruxes. The first, a diary Riddle utilised as a Horcrux also preserved part of his adolescent identity, an incredibly patient, manipulative and brilliant villain. Not only did the tool allow for the Chamber of Secrets to be reopened and wreak havoc on the school, it also ensured that it did so with a maximum amount of emotional and psychological damage to the user, and drew on racist fears within the wizarding community. He then creates one in the bloodshed of the Riddle family massacre using Marvolo Gaunt\u2019s Ring. The next Horcrux was Rowena Ravenclaw\u2019s Diadem. It is an insightful choice, selected to undermine the purity of an object of iconography held in high esteem, it was also a choice that would ensure that piece of his soul is protected as an object of such historical value would always be protected. He retrieved two more Hogwarts artefacts, Salazar Slytherin\u2019s Locket and Helga Hufflepuff\u2019s Cup, to turn into Horcruxes. The next was unintentional, Harry Potter himself became a"} +{"output_text": " for the 3D kernel. We didn't need to reverse-engineer the 3D kernel because it was already in the public domain.\n\nWhat is the status of the project?\n\nRodrigo: We are working on version 5.1 of the spec. We are also working on a new version of the software. We are also working on a new version of the website.\n\nWhat is the status of the project?\n\nRodrigo: We are working on", "input_text": " 3D kernel that is possibly licensed by Autodesk (and other companies) and embedded into AutoCAD. Sometimes it is text and sometimes it is binary. When it is encoded as text, it is almost human-readable. With enough samples one could reverse-engineer it without much effort.\n\nWe were able to decode 3D objects as a text block almost by accident: the spec wasn't clear enough, and there was a silly hack to make it not human-readable at a first glance. We found it out when an Ukrainian hacker sent a message to our mailing list. His English was very poor and we couldn't figure out what he meant until he replied with a code snippet in C.\n\nCurrently, LibreDWG can decode R13, R14 and DWG 2000 files. DWG 2004 support is almost ready, but we still need to work on 2007 and 2010. Until version 5.1 of the spec there wasn't enough information to implement it because either the document was corrupted or it had several sections intentionally omitted. Write support is partially implemented, but reading is our priority.\n\nJudging by last year's discussion in the mailing list, you never made any formal contacts to ODA, so it's not quite clear whether they think their spec is used legally. Is it correct?\n\nRodrigo: No, we never made any formal contacts to ODA. They published the specification in a public area on their website, we read it and it helped us write our software. We never used their spec in any other way than that. If they didn't want people to do it, they wouldn't have published it in the first place.\n\nHow much of the work is based on the spec and how much did you have to reverse-engineer yourself?\n\nRodrigo: We didn't do much reverse-engineering work, except"} +{"output_text": " to do than checking for syntax errors. In fact, the Parser can check for direct calls to eval without actually executing the code. This is because the Parser can use the AST to check whether the function contains a call to eval .\n\nThis is a very important property of the AST, because it means that the Parser can check for direct calls to eval without actually executing the code. This is a very important property of the AST, because it means that the Parser can check for", "input_text": ". In this setting, the AST is a proof that the syntax has been checked. This proof is used by the the Bytecode Emitter, which will itself produce Bytecode, etc.\n\nNow, all of this is important to us because some of the verifications that the JavaScript VM needs to perform have a strong impact on performance.\n\nDirect evals and indirect evils\n\nIn addition to checking the syntax, the Parser must check many other, more subtle, properties of the program. For instance, consider the most dynamic construction in the JavaScript language: eval. This magic function takes a string and executes it immediately, as if it were a source code. From the point of view of VM implementors, it\u2019s scary because it can change the structure of the program itself in interesting and unpredictable ways:\n\n( function () { var a = 10 ; function foo ( code ) { eval( code ); return a ; } console. log ( foo ( \"\" )); // \"10\" console. log ( foo ( \"var a\" )); // \"undefined\" })();\n\n(seriously, don\u2019t use eval ).\n\nBecause eval is so powerful, making eval work in a JIT is very hard. Hard enough that it basically cannot optimize a function if it thinks that there may be a direct call to eval lurking somewhere in this function.\n\nFor this reason, in addition to checking that the syntax is valid, the Parser also checks which functions contain a direct call to eval. This information is stored in the AST and used later during the execution to decide which functions can be optimized and which cannot. In other words, the AST also serves as a proof that checking whether a function contains eval has been performed.\n\nOne of the reasons for which we don\u2019t like checking for direct calls to eval is that it\u2019s actually harder"} +{"output_text": "\u2019t do nothing about that.\u201d\n\nAD\n\nLove-Robinson\u2019s grandfather also told the Sun-Sentinel that he had been in contact with the teen\u2019s mother, who was in the hospital at the time of the alleged theft.\n\nAD\n\n\u201cShe\u2019s a good mother,\u201d Robinson said. \u201cShe\u2019s a good woman. She\u2019s a good Christian woman. She\u2019s a good mother. She\u2019s a good wife. She\u2019", "input_text": " Department of Health\u2019s license verification database.\n\nThen, of course, there\u2019s the fact that Love-Robinson looks so young.\n\nAD\n\nAD\n\nAfter getting another complaint about the supposed him in January, Florida health and anti-narcotics authorities set up a sting on the clinic this month, according to WPBF. An unnamed detective with the Palm Beach County Sheriff\u2019s Office told the TV station that an 86-year-old woman had contacted them after Love-Robinson visited her home when she was feeling ill.\n\nHe told her she needed to go to the hospital, and as she was loaded into an ambulance, she said she asked him to return her purse to her house, the woman said. Instead, investigators believe, he allegedly stole her checks and had been cashing them around town.\n\nBut Love-Robinson\u2019s grandfather William Robinson says this is all a misunderstanding. He told the Sun-Sentinel that the teen never claimed to be a medical doctor, just a holistic doctor, and that he held certificates to practice from online schools.\n\nAD\n\nAD\n\n\u201cHe was pursuing things, but I don\u2019t really know what it came to,\u201d Robinson said. \u201cHe was pursuing the field that he wanted to get into.\u201d\n\nHe denied that Love-Robinson had received any patients at his clinic, adding that the goal was to set up the clinic and then attract licensed doctors to come work there. Robinson also told the Sun-Sentinel that his grandson was an \u201cup-and-coming minister\u201d at the church where Robinson himself is pastor.\n\n\u201cHe\u2019s not out doing drugs, he\u2019s not out trying to rob nobody,\u201d Robinson said. \u201cHe\u2019s trying to do something constructive, and if he did do something and the paperwork wasn\u2019t right \u2014 he can"} +{"output_text": " it will be staggering.\n\nThe coronavirus has not just changed the way we live, it has changed the way we die.\n\nThe coronavirus has not just changed the way we live, it has changed the way we die.\n\nThe coronavirus has not just changed the way we live, it has changed the way we die.\n\nThe coronavirus has not just changed the way we live, it has changed the way we die.\n\nThe coronavirus has not just changed the way we", "input_text": " ignored it for decades, not to mention the family disintegration and the drug epidemic it has spawned.\n\nSo far, about 10,000 Americans have died from the Wuhan coronavirus. That number will rise, and it will likely include people you know. That's a tragedy. But it's not the only tragedy in progress in this country.\n\nIn 2018, more than 67,000 Americans died of drug overdoses. The year before, more than 70,000 died. That's more than the entire population of the towns most of us grew up in. And those totals are far lower than the real number, according to people who study the question.\n\nThe drug epidemic has permanently changed the demographics of this country. But for some reason, CNN has not kept a running tally of drug casualties on the screen. Why is that? Well, you know why. It's not their peer group. It doesn't seem real. They're not that interested. And the same thing is going on now.\n\nIf the coronavirus shutdown was crushing college administrators or nonprofit executives or green energy lobbyists, it would have ended last week. Instead, it's mainly service workers and small business owners who have been hurt, and they're not on television talking about what they're going through. You need to look closely to see their suffering.\n\nOver the weekend, the head of Indiana's Family and Social Services Administration announced the calls to the state's mental health and suicide prevention hotline had gone from about 1,000 to 25,000 a day.\n\nCalls to Indiana's addiction hotlines have risen dramatically as well. Reports of domestic violence have spiked in this country and in fact, around the world. In France, they rose 32 percent in a single week. Someday, we will get the numbers on the child abuse going on during this lockdown, and"} +{"output_text": "?\n\nSusan Eisenberg: I am excited for the merchandise to come out. I am excited for the movie to come out. I am excited for the movie to be successful. I am excited for the movie to be a success. I am excited for the movie to be a success. I am excited for the movie to be a success. I am excited for the movie to be a success. I am excited for the movie to be a success. I am excited for the movie to", "input_text": "Susan Eisenberg: I went off what I knew and what was on the script. I didn't grow up with comics except for maybe Archie Comics. If I am working on a project I read the script and listen to what the writer has to say and what the director has to say and kinda go from there. Twitter has really opened me up to the culture through fans and other artists and what people are loving about the character right now.\n\nComiConverse: Do you have many family members or people that run up to you and ask you to do the voice?\n\nSusan Eisenberg: The irony is, when I was doing the show (Justice League) my nephews were all about the Flash. Although when their friends would go over to their house and I happened to be visiting and my nephew would introduce me they would be like \"Dude, do you know who she is?\" and my nephews would respond, \"Yeah, thats my aunt?\".\n\nFor their friends it was exciting, but for my nephews I was just their aunt. I try not to do the voice to often because I don't want people to look at me when I do the voice. I don't want them to see me when they hear the voice with my blonde hair and green eyes with no tiara...on a weekday. I don't want them to be disappointed. \u00c2 It such a fragile thing when someone loves a character and I want to have the kids keep their idea of the character.\n\nComiConverse: One of the big things DC is doing is making Wonder Woman their top merchandising focus, along with Batman and Superman, for the Batman v. Superman: Dawn of Justice movie. This is big deal for many because you finally get to see is a female superhero being properly promoted.\n\nAre you excited for the merchandise to finally come out"} +{"output_text": " connection created\" ); return connection ; } private void DeallocateConnection ( IDbConnection connection ) { connection. Dispose (); } }\n\nThe above code is a bit more verbose than the original code, but it\u2019s a lot more readable. The key takeaway is that we\u2019re now able to see the actual consumer of the context.\n\nThe above code is a bit more verbose than the original code, but it\u2019s a lot more readable. The key takeaway is", "input_text": "ve reduced the actual code used which also entailed pooling / throttling for sake of brevity, but the following should give you an idea of how it works:\n\n1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 public class DiagnosticsConnectionFactory { // this is the connection provider that was originally // used before we added this diagnostics provider. private readonly OriginalConnectionProvider _originalProvider ; private readonly ILogger _logger ; public DiagnosticsConnectionFactory ( OriginalConnectionProvider originalProvider, ILogger < DiagnosticsConnectionFactory > logger ) { _originalProvider = originalProvider ; _logger = logger ; } public IDbConnection CreateConnection () { // the int provided to the StackFrame constructor determines how many frames // to skip up the stack. Zero would point to this method, each number greater // to further up the stack. We found it most useful to move two frames up the // stack because one frame would typically simply point us to the ctor of an // entity framework context. What we actually wanted to see was the consumer // of said context. Tweak this number or do something more intelligent as // you deem fit. var stackFrame = new System. Diagnostics. StackFrame ( 2 ); var allocateeMethod = stackFrame. GetMethod (); var allocatee = $\" { allocateeMethod. DeclaringType?. Name?? \"Unknown\" }. { allocateeMethod. Name } \" ; var connection = new DiagnosticsDbConnection ( _originalProvider. CreateConnection, DeallocateConnection, allocatee ); _logger. LogDebug ( \"Database"} +{"output_text": " not stop talking about the case.\n\nStone has said he will not testify in the case, but he has not said whether he will assert his Fifth Amendment rights.\n\nThe case is expected to last several more weeks.\n\nThe trial is the first of several that are expected to be held in the coming months as part of the special counsel\u2019s investigation into Russian interference in the 2016 election.\n\nThe trial is also the first of several that are expected to be held in the", "input_text": " coronavirus concerns The Intercept's Glenn Greenwald discusses U.S. case against Assange MORE, but the prosecutors alleged that was a lie in order to protect the InfoWars-affiliated conspiracy theorist Jerome Corsi.\n\nEvidence presented by prosecutors shows that Stone repeatedly pressured Credico not to cooperate with the House committee\u2019s investigation. Credico ultimately decided to assert his Fifth Amendment rights after the panel issued him a subpoena.\n\nCredico said on the stand that Stone\u2019s influence played a role in that decision. The jury saw emails and text messages between the two showing Stone hurling invective at Credico, who was urging his friend to correct his testimony.\n\n\u201cYou are a rat,\u201d Stone wrote to him in April 2018. \u201cA stoolie. You backstab your friends-run your mouth my lawyers are dying Rip you to shreds.\u201d\n\n\u201cI am so ready,\u201d Stone added. \u201cLet\u2019s get it on. Prepare to die, c--ksucker.\"\n\nStone\u2019s legal team argued that the self-described trickster was not trying to deceive Congress but that he believed the WikiLeaks controversy did not fit the House Intelligence Committee\u2019s parameters of its investigation into Russia\u2019s efforts to influence the election.\n\nADVERTISEMENT\n\nHis lawyers also argued that there was nothing improper in the Trump campaign seeking out information about a rival.\n\n\"In fact, so much of this case deals with that question that you need to ask: So what?\" Bruce Rogow, a member of the defense team, told the jury.\n\nStone's legal team did not call any witnesses and rested their case after playing an hourlong audio tape of his testimony before the House Intelligence Committee.\n\nStone's case became a spectacle for the political world.\n\nJackson imposed a gag order earlier this year because Stone would"} +{"output_text": " to see a lot of people that are starving. They are not going to have food. They are not going to have water. They are not going to have anything. They are going to die. And they are going to die because of the lack of food. And they are going to die because of the lack of water. And they are going to die because of the lack of medicine. And they are going to die because of the lack of everything.\n\nAnd the people that are in", "input_text": " Jair Bolsonaro, Pinera, Macri, and a lot of presidents of the region that are supporting free elections in Venezuela. But to have free elections, you have to first take Maduro from power. This is the first step. Maduro doesn\u2019t want to get out, not only because he\u2019s a dictator but he\u2019s a common criminal. He works with the drug dealers. He knows that once he takes off the power, it\u2019s his life. It\u2019s not only that \u201cOK, I will not be the president anymore. I will do another thing.\u201d No, no, no.\n\nHe knows that his life is at risk, even with the possibility of other friends that are around him kill him. Because if he says, \u201cOK, I will step out. I\u2019m going to live in Cuba or North Korea.\u201d It\u2019s not going to be that easy because there are a lot of people around that have their criminal business or terror business. They are not going to be happy with that. And they can do bad things, too. We know how criminals deal with each other. So it is a big issue. No one has the right solution. Everybody wants to do that. We filed military action in Venezuela. But we have somehow to twist the militaries inside of Venezuela. I think this is the big issue that everybody trusts that, \u201cOK. Now it\u2019s for real. Let\u2019s go do everything against Maduro because otherwise they are going to suffer with the starving, and they\u2019re going to die hungry.\u201d\n\nLast year, I went to the border between Brazil and Venezuela to a state called Roraima. From the capital of Roraima to the border, it is 200 kilometers. When you get in a car and go to the road and drive this 200 kilometers, you\u2019re going"} +{"output_text": " for his selfless act of conscience?\n\nThe US government has already made it clear that it will not be releasing Mr. Snowden to any country that has not been vetted by the US government. This is a clear indication that the US government is not interested in Mr. Snowden\u2019s well-being. It is only interested in punishing him for his act of conscience.\n\nThe US government has already made it clear that it will not be releasing Mr. Snowden to any country that", "input_text": " and technicalities but who otherwise have no hesitation in looting the public treasury or starting wars or killing their own citizens when their personal agendas or profit motives are involved. Apparently Mr. Snowden cannot travel because his passport has been (unlawfully, mind you) revoked. So what? Are we supposed to believe it is so difficult to issue him one by one of the other nations? Let me tell you, if he were a spy for some nation or even a low level operative helping the banksters in one of their looting schemes, TEN passports would have been issued to him instantly. This only goes to show that most governments today, \u201cdemocratic\u201d or otherwise either belong to or are subservient to the same criminal mafia. They have demonstrated that they aren\u2019t independent nations but mere colonies/puppet states of the US Empire. In fact, most of the political leadership of the world seems to be in cahoots with each other sharing the same bed. These so-called countries are mere prisons with their political \u201cleaders\u201d prison wardens appointed by and/or serving at the pleasure of the US government. They are happy to exploit their prison populations for personal enrichment but with the permission of and in collaboration with the US and the cartel controlling it. This is the only explanation for the deafening global silence and inaction in the Snowden case.\n\nMr. Snowden\u2019s Shameful Treatment\n\nLetting Mr. Snowden loiter indefinitely in a transit area, is this how the world, especially Russia, want to be seen and recorded by history as having treated those have made the ultimate sacrifice for the greater good? Whatever treatment the US government has in store for Mr. Snowden, he is already being punished by being forced to sit inside an airport terminal in God knows what conditions. Is this how the world intends to thank Mr. Snowden"} +{"output_text": "\nThe compiler is a new program, written in Go, that is configured by the environment variables GOOS and GOARCH. It is a genuine (machine assisted) translation, so the new programs are essentially the old programs translated rather than new ones with new bugs. We are confident the translation process has introduced few if any new bugs, and in fact uncovered a number of previously unknown bugs, now fixed.\n\nThe compiler is a new program, however; it is described below.\n\nThe", "input_text": " the packages net and crypto/x509, as well as a number of other fixes and improvements.\n\nTranslating\n\nAs part of the process to eliminate C from the tree, the compiler and linker were translated from C to Go. It was a genuine (machine assisted) translation, so the new programs are essentially the old programs translated rather than new ones with new bugs. We are confident the translation process has introduced few if any new bugs, and in fact uncovered a number of previously unknown bugs, now fixed.\n\nThe assembler is a new program, however; it is described below.\n\nRenaming\n\nThe suites of programs that were the compilers ( 6g, 8g, etc.), the assemblers ( 6a, 8a, etc.), and the linkers ( 6l, 8l, etc.) have each been consolidated into a single tool that is configured by the environment variables GOOS and GOARCH. The old names are gone; the new tools are available through the go tool mechanism as go tool compile, go tool asm, and go tool link. Also, the file suffixes.6,.8, etc. for the intermediate object files are also gone; now they are just plain.o files.\n\nFor example, to build and link a program on amd64 for Darwin using the tools directly, rather than through go build, one would run:\n\n$ export GOOS=darwin GOARCH=amd64 $ go tool compile program.go $ go tool link program.o\n\nMoving\n\nBecause the go/types package has now moved into the main repository (see below), the vet and cover tools have also been moved. They are no longer maintained in the external golang.org/x/tools repository, although (deprecated) source still resides there for compatibility with old releases.\n\nCompiler\n"} +{"output_text": "KHz filter), with the filter in place. The filter is mounted on the stand-offs, and the filter capacitors are mounted on the board. The filter is a simple 3-pole Butterworth, with a 3-turn pot for the cutoff frequency. The filter is mounted on the stand-offs, and the filter capacitors are mounted on the board. The filter is a simple 3-pole Butterworth, with a 3-turn pot for the cutoff frequency. The filter is mounted on the", "input_text": " 10-turn wirewound pot instead, as I like the feel of those pots. From the front (as you will see in later photos) the 2 National knobs and escutcheon plates look the same. However, the knob on the left is connected directly to the 10-turn pot and not to a National Velvet Vernier reduction drive. The black escutcheon plate for the regen control is spaced away from the front panel by one washer thickness, and bolted to the front panel with 4-40 hardware. It is not used for anything, other than looks.\n\nNow, let\u2019s look at some of the boards. They were built, as always, with W1REX\u2019s very useful MeSQUARES and MePADS. This is the AF output stage and the 4KHz filter mounted on one board, and installed in the chassis. The idea was that this board, together with the main RF board, would form a working receiver, after which I could build and install the other filters, one by one \u2013\n\nMounted above the AF output stage, on the stand-offs, is this next filter board, carrying 2 LPF\u2019s. The first filter to be built was the 6KHz one \u2013\n\nNext came the 3KHz filter (in the foreground of the next shot). The grey rectangular poly capacitors were from Tayda Electronics. Thier prices are low, and the caps seem good. The resistors are 2 types \u2013 either 5% carbon film from the parts stash I had as a kid in England in the early 80\u2019s. They lasted a long time, but I am beginning to run out of them. The others are newly-acquired Xicon 1% metal film parts, purchased in lots of 200 from Mouser \u2013\n\nThe same board, taken from above (3"} +{"output_text": " about it here: https://flylitchi.com/help#autopano-added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added support for Altitude Hold mode- added", "input_text": " gimbal angles would not always be correctly loadedVersion(March 29, 2018)- added lock/unlock feature for mission editing; when loading a mission it will be locked for editing by default- improved support for altitudes above ground level in waypoint mode- fixed bug where in some cases the POI setting of a waypoint would not save correctlyVersion(March 29, 2018)- added lock/unlock feature for mission editing; when loading a mission it will be locked for editing by default- improved support for altitudes above ground level in waypoint mode- fixed bug where in some cases the POI setting of a waypoint would not save correctlyVersion(Feb 13, 2018)- added support for altitudes relative to ground- added support for batch editing (using control/command + left click for multiple waypoints selection)- added new help section for Mission Hub: https://flylitchi.com/help#missionhub - added support for Visual Mission Planning with Google Earth Pro - altitudes now have up to 1 decimal of precision- improved support for CSV import/export- fixed bug where in some cases the POI setting of a waypoint would not save correctlyVersion(Feb 27, 2018)- fixed crash at startup on devices with intel processorsVersion(Feb 08, 2018)- added RTH/Land button in FPV mode- added Smart Return to Home general setting- added support for bluetooth controllers- misc bug fixes and improvementsVersion(Jan 18, 2018)- re-enabled Infinity Focus (Mavic Pro only)Version(Jan 11, 2018)- added support for x7 camera- updated to DJI SDK 4.4 which fixes the crash on 32 bit devicesVersion(Dec 22, 2017)- speed improvements for panoramas shot with Pano mode- added Auto Pano and Panorama database features (can be used for in-app panorama stitching and sharing), learn more"} +{"output_text": " by the end of the year.\n\nThe city would then have to approve the appointment.\n\n'A lot of work'\n\n\"It's a lot of work,\" Jordan said. \"We're going to have to do a lot of work to get this done.\"\n\nThe settlement also calls for the city to pay $1.5 million to the plaintiffs' attorneys, $1.5 million to the NAACP Legal Defense Fund and $1.5 million to the Law", "input_text": ".\n\nA promising path\n\nCity leaders lauded the settlement Wednesday, saying it offered a promising path toward emerging, at long last, from federal court oversight.\n\n\"It's the best-case scenario for us based on the alternatives. We had always been opposed to a receiver,\" said Jordan, who took part in the negotiations. \"If you look at the duties and responsibilities, the compliance director will be charged with actually finding solutions and working with all stakeholders to get us into compliance.\"\n\nJordan said the difference between a receiver and compliance director is that under a receiver the city would have had no say in who that person would be or what they could do. With a compliance director, the city can nominate candidates for the job and work with the appointee to help meet goals.\n\n'Would have no say'\n\n\"With the receiver we would have no say in the future of our department,\" Jordan said.\n\nCivil rights attorney Jim Chanin, who helped craft the settlement along with attorney John Burris, said he sees little difference in the titles. He said it was important to city officials that the compliance director not be referred to as a receiver but called it a matter of semantics, saying, \"I see this as a receiver with a different name.\"\n\nHe said the compliance director's power to fire the chief is unlike any other department he knows of in the nation.\n\nBurris said he is \"cautiously optimistic that we can set a more positive direction for the department and get it into compliance. If this is done properly, this department will be in a position to engage in constitutionally sound policing.\"\n\nJordan said the city is in the process of identifying possible compliance directors, though it may be some time before that's decided. According to Wednesday's deal, the city and the civil rights lawyers would try to agree on a candidate"} +{"output_text": " enhanced with the addition of a Mk-41 (64 cell) rail-gun assembly, a Mk-41 (64 cell) CIWS, or a Mk-41 (64 cell) CIWS with a laser weapon.\n\n\u2022 DDG-1000: This is a new-design DDG with a Mk-41 (64 cell) CIWS, a Mk-41 (64 cell) CIWS with a laser weapon, and a Mk-41 (", "input_text": " the same kit for the next FRTP or switch to another depending on the combatant commander\u2019s requirements. The payload kit is designed to plug into the existing service connections in the hull, which would supply chill water, data connections, and electrical power buses in set locations ready to be uncapped and fitted to the kit\u2019s component modules. Sound familiar? This was how Mk-15 Phalanx close-in weapon systems (CIWS) were first installed. The difference here is that where CIWS was a stand-alone installation, the universal plug-and-play (UPP) modules are integral parts of the ship\u2019s sensor and weapon suites:\n\n\u2022 DDG kit: An air- and missile-defense radar (AMDR) in a composite housing that fits atop the existing superstructure, an additional Mk-41 (64 cell), rail-gun assembly, or laser weapon is fitted into the topside weapon bay. Platforms equipped with the DDG kit would replace the need for the Arleigh Burke Flight III and, over time, the current inventory of destroyers and cruisers. This kit meets the proposed requirements for a new-design area-defense surface combatant described in the Congressional Research Service report Navy DDG-51 and DDG-1000 Destroyer Programs: Backgrounds and Issues for Congress.3\n\n\u2022 DDE kit: This is an antisubmarine warfare (ASW) kit with a multifunction towed-array (MFTA) sonar and SLQ-25 Nixie towed torpedo decoy (already in their own self-contained modules) fitted in the payload bay aft under the flight deck and around the stern ramp. As part of this payload kit, Mk-32 surface-vessel torpedo tubes (SVTT) are fitted to the weather decks. This kit could be further"} +{"output_text": " visited her often, and one day I was sitting with her when she asked me to tell her a story. I told her about the time I was a kid and my dad took me to a game at RFK. I told her about the time I was a kid and my dad took me to a game at RFK. I told her about the time I was a kid and my dad took me to a game at RFK. I told her about the time I was a kid and my", "input_text": " in nine first-half minutes to demolish Real Salt Lake and confirm his MLS MVP credentials: Many of us in the RFK press box really thought the old house might crumble during that year\u2019s Gold Cup, when a sellout crowd saw El Salvador \u2013 whose national team has made the stadium their second home on account of the region\u2019s large expatriate population \u2013 duel Panama in a breathless quarterfinal clash. The passion, noise and well, vibration was palpable as the game went to PKs: In 2012, Englishman Lewis Neal scored the game-winning goal in a D.C. victory over Columbus that clinched a playoff spot for United, ending a painful five-year postseason drought and sparking a particularly visceral reaction from the home faithful: The US men's national team has also made hay at RFK, their most successful home venue of all time. They packed the place for a fun friendly with Germany to commemorate the US Soccer Federation\u2019s centennial in 2013: They also capped the 2010 World Cup qualifying cycle with highly dramatic draw vs. Costa Rica, just after Charlie Davies\u2019 nearly deadly car crash. Despite Bryan Ruiz\u2019s virtuoso display, Jonathan Bornstein\u2019s late header snatched qualification out of the Ticos\u2019 hands and gave it to Honduras instead: Many USMNT fans want one more game at RFK before it\u2019s torn down. Here\u2019s hoping it happens. \u2013 Charles Boehm\n\nReally, soccer is sometimes the last thing I think of when I think of RFK Stadium. Mostly, I think about the people, and the building itself. I think about how the place has aged over the years, how the paint has flaked away, how its cracks have started to show. A few years back, I lost a loved one. She was in her 90s and spent her final years in bed. I"} +{"output_text": "\nThis card is a Changeling, but it's a Puffer. Puffers are a weird creature type that are really good at being a Changeling. They have a lot of flexibility in what they can do, and they can be a lot of fun to play. The Puffer is a great example of this. It's a 3/3 for 3 mana, and it can be a 3/3 for 3 mana, a 3/3 for 2 mana, or", "input_text": " creatures give us a Skeleton and a Shade to put them on after all. As it turned out, the flavor of the regeneration abilities on Black Poplar Shaman and Mad Auntie ended up being fine as well.\n\nCR01_PEA\n\nRed Changeling\n\n3R\n\nCreature - Shapeshifter\n\n2/3\n\nChangeling (This card has all creature types, even when it isn't in play.)\n\nR: CARDNAME gets +2/-2 until end of turn.\n\nThis guy reminded me so much of Urza's Saga oddball Dromosaur that I was really hoping his changeling ability granted him the subtype \"Dromosaur.\" Sadly, Dromosaur was a Lizard. The Fire-Belly Changeling that this card became has two benefits over the Red Changeling design. First: Fire-Belly Changeling is a two-mana Giant, which helps the curve of Giant decks and is awesome with Blind-Spot Giant. Second: once upon a time the \"Elemental mana-activated abilities matter\" family of interactions between Soulbright Flamekin, Flamekin Brawler, and Ceaseless Searblades had a lot more emphasis as an Elemental theme than it does today. Fire-Belly Changeling provided extra mana-activated abilities in Flamekin decks, and could activate twice without dying, unlike the original \"Red Changeling.\"\n\nCG01_PEA\n\nChangeling Puffer\n\n3G\n\nCreature - Shapeshifter\n\n3/3\n\nChangeling (This card has all creature types, even when it isn't in play.)\n\n2G: CARDNAME gets +2/+2 until end of turn. Play this ability only once per turn.\n"} +{"output_text": " work. ID: 93188 \u00b7 Rating: 0 \u00b7 rate:\n\nNickf19\n\nSend message\n\nJoined: 2 Jul 17\n\nPosts: 1\n\nCredit: 97,410\n\nRAC: 76\n\nJoined: 2 Jul 17Posts: 1Credit: 97,410RAC: 76 Message 93190 - Posted: 3 Apr 2020, 9:00:00 UTC I'm glad to see that you are still working on this. I'm sure", "input_text": " big thank you to Rosetta@home for the inestimable bounty of being able to assist, even if only in such a simple thing as sharing my computer cycles. I am aware that Rosetta@home is one of a number of organisations and groups working to resolve both this current crisis, and the multiple other debilitating condition which affect humankind. To each and every one, kudo's and bouqets.\n\n\n\nIf the world stands together, then together it will overcome. If the world stands as singles, then each will fall and fail singly.\n\n\n\nRomane ID: 93166 \u00b7 Rating: 0 \u00b7 rate:\n\nNickf19\n\nSend message\n\nJoined: 2 Jul 17\n\nPosts: 1\n\nCredit: 97,410\n\nRAC: 76\n\nJoined: 2 Jul 17Posts: 1Credit: 97,410RAC: 76 Message 93182 - Posted: 3 Apr 2020, 8:43:18 UTC What an awesom news! I'm proud, that (even in a really little way) we are helping you scientist fighting this battle. Thanks ID: 93182 \u00b7 Rating: 0 \u00b7 rate:\n\nJJAR\n\nSend message\n\nJoined: 19 Oct 06\n\nPosts: 1\n\nCredit: 402,138\n\nRAC: 448\n\nJoined: 19 Oct 06Posts: 1Credit: 402,138RAC: 448 Message 93188 - Posted: 3 Apr 2020, 8:53:30 UTC I would like to help with my 4 ARM64 and also 6 ARM32 linux mini-boards. But unfortunately the minimum memory they need is near 2Gb and all of mine are modest with only 1Gb. Please consider to compile for ARM32 and also try to reduce the target memory of the programs.\n\nAnyway, thanks for your"} +{"output_text": " with the original Macintosh keyboard.* the Fn-layer is activated by pressing the Fn key. This is the same as the original Macintosh keyboard.* the Fn-layer is activated by pressing the Fn key. This is the same as the original Macintosh keyboard.* the Fn-layer is activated by pressing the Fn key. This is the same as the original Macintosh keyboard.* the Fn-layer is activated by pressing the Fn key.", "input_text": " and presented police officers with a 'ransom' note from her kidnappers: a controversial piece of evidence that has raised eyebrows for its odd language and length. But hours later, JonBenet was found dead in the basement with a rope around her neck and a crack in her skull.\n\nTheories have circulated for years about how the young girl was killed, with many saying JonBenet's parents murdered her and then tried to cover it up. Her parents maintained that she was killed during a botched kidnapping. Patsy died of ovarian cancer several years ago before any concrete conclusions about the death were made.\n\nBurke's lawyer warned, however, exactly what would happen to those who questioned his client's innocence.\n\n'Burke successfully sued every member of the tabloid media who accused him in 1998 and 1999 of killing his sister.\n\n'For the last 17 years, no one has been foolish enough to repeat that false accusation until Spitz chose to do so in the WWJ radio interview and with CBS in September of 2016,' he said in court documents. 11 Aug 2011, 13:34\n\nNKRO (6KRO right now)\n\nHand wired matrix with diodes\n\nSpring switch replacement: 61g --> 40g\n\nMini-USB connection with Teensy controller\n\nCase and keys retr0brited.\n\nColemak and Qwerty switchable\n\n2nd Miniguru-like layer*\n\nMuch quieter because of pcb removal and lighter switch springs\n\nMouseKeys\n\nStill have to try the Mousekeys feature which wouldn't work immediately.\n\nFinished my M0110 mod. This is the keyboard that came with the original Macintosh 128k.Features:* meaning the cursor keys are on a second Fn-activated layer.This makes the keyboard fully usable"} +{"output_text": " experience for the drinker.\n\nIn the past, coffee was a luxury item that was only enjoyed by the rich and famous. Today, coffee is a staple in many people\u2019s diets, and it is becoming more and more accessible to the average consumer.\n\nIn this article, we will take a look at some of the best coffee shops in Michigan. We will also take a look at some of the best coffee roasters in Michigan.\n\nThe Best Coffee Shops in Michigan", "input_text": "\u201cI have a reporter here from the The New Yorker,\u201d Stournaras told him. \u201cShall I put you on?\u201d\n\nStournaras activated the speakerphone setting so I could hear. Samaras\u2014laughing knowingly\u2014informed me that despite \u201cprevious ideological differences\u201d he and Stournaras share a common goal: keeping Greece in the eurozone.\n\n\u201cThat\u2019s what I told him!\u201d Stournaras said.\n\n\u201cDo you hear me, Yannis?\u201d\n\n\u201cYeah, yeah, I do.\u201d\n\n\u201cAm I correct in this assessment?\u201d\n\n\u201cAbsolutely.\u201d\n\nThen Samaras quoted Neil Diamond. \u201cYou know that song that says, \u2018Used-to-bes don\u2019t count anymore, they just lay on the floor till we sweep them away?\u2019\u201d he said. \u201cThe idea is that differences don\u2019t matter as long as there is a common cause that links us together.\u201d\n\nAs Samaras spoke, Stournaras smiled appreciatively. Despite this display of seemingly genuine affection, it was hard for me to forget what I\u2019d been told a day earlier: that for all this friendship, Stournaras could yet prove dispensable. Note: We got a lot of feedback from our readers after we published Seven of the Best Coffee Shops in Michigan last month. So, we\u2019ve brought the subject back and sent Erica Starr, a Michigan coffee expert, on a tour of the mitten to compile her list of recommendations. This is her first installment.\n\nA cup of coffee, to many, is a simple pleasure. To some, however, that cup of coffee is much more than just a drink; it represents passion for an ever-developing craft. Similar to beer or wine, coffee is a complex beverage that holds the capability to produce a delicious"} +{"output_text": ". We have to build a new kind of organization that can be a force for unity and not division.\n\nThe CPN-M is a good example of this kind of unity. It\u2019s a party that has been around for a long time, and it\u2019s been able to build a strong base of support among the working class. It\u2019s also been able to build a strong base of support among the oppressed, including women, Dalits, and ethnic minorities. It\u2019s", "input_text": ", you can only achieve unity across differences. Identity is more about sameness\u2014the idea that we can define a group of people coherently against other groups of people\u2014but unity is about working through differences to build a political programme. We need this kind of unity, one that doesn\u2019t ignore power differences amongst subordinated groups, but respects them and works through them, to build new ways of organizing production, distribution, and living in general.\n\nAn example is the work of the Communist Party of Nepal \u2013 Maoist (CPN-M), which I had a couple of opportunities to look at up close. There are different revolutionary organizations of students, of women, of workers, of peasants, of indigenous peoples, of ethnic and religious groups, and so on, each of which carry out political work in their respective fields. The Maoists also have a cultural organization and an armed wing. These different groups are coordinated by the CPN-M so that they can work together to produce revolutionary change. The All Nepal Women\u2019s Association (Revolutionary) or ANWA(R), for example, focuses on issues that affect women, especially combating domestic violence, polygamy, alcohol abuse, and so on\u2014often, this means that working class women are challenging working class men. But ANWA(R) is plugged into the broader revolutionary analysis and line of the CPN-M. They recognize that the fundamental causes of gender oppression are linked to the state and class structure in Nepal, and that all of that has to be changed in a revolutionary way. Respecting the diversity of focuses, they still work together for the ultimate goal. That\u2019s what I mean when I stress unity, one that doesn\u2019t deny difference but works through it.\n\nIn other words, we have to counter the magnificently organized apparatus of the capitalists and the state with our own magnificent organization"} +{"output_text": " same impact as the other shows.\n\nI\u2019m not sure if the show is going to be a hit or not. I\u2019m not sure if the show is going to be a hit or not. I\u2019m not sure if the show is going to be a hit or not. I\u2019m not sure if the show is going to be a hit or not. I\u2019m not sure if the show is going to be a hit or not. I\u2019m not sure if", "input_text": " with jokes. They seem committed to the anti-humor sketches, but they might have a hit on their hands by including interviews like the ones they did for their YouTube channel. It will have them be more comparable to Eric Andre, but people [aimeemccarthy being one] already accuse the troupe of ripping off Tim and Eric. The YouTube content and the goofs in the second episode show that the show can be its best if they extemporize a bit more.\n\nMy first impression of the show with its first episode was that the writers mistook incomprehensibility as a form of asserting intellectual superiority.\n\nThe big issue here is that at points in watching MDEs material online and on [adult swim], my mind develops a filter that puts fedoras on the men. I have been to heavily influenced by Internet femenism and seen to much of what Internet femenism makes fun of in the real life to have much patience for guys who are too cool to be funny on a comedy. Chris D\u2019Elia drained me of that patience.\n\nThe fandom and the show are serious about taking nothing seriously. Take a look at the reviews of the show on IMDB.\n\nI feel required to sigh. I feel like this could have been written by a number of guys I go to school with. [adult swim] has been called out for the lack of diversity when it comes to the creators of its shows, nearly all of them are white males. Not a single female. With Brad Neely\u2019s Harg Nallin\u2019 Sclopio Peepio already the role of the show that\u2019s \u201ctoo cool and alt comedy to be funny\u201d has been filled. The 2015-2016 TV season was not a good year for sketch comedy so far. Harg Nallin\u2019 hasn\u2019t had the"} +{"output_text": " the Sharktooth Hill Bonebed [106].\n\nFig. 25\n\nBonebed 2 (Bonebed 2, 25). (a) Photograph of the bonebed. (b) Photograph of the bonebed showing the bioclasts and clasts. (c) Photograph of the bonebed showing the bioclasts and clasts. (d) Photograph of the bonebed showing the bioclasts and clasts. (e) Photograph of the bonebed showing the", "input_text": ", 25). Despite lacking an erosive base, Bonebeds 2 and 4 exhibit loosely packed bioclasts/clasts, and Ophiomorpha burrows that extend up to 2 meters below the \u03b2-interval that are infilled with densely packed bonebed debris. These bonebeds also exhibit clear trace fossils within the bonebed. These data indicate that certain bonebeds have been bioturbated and biologically mixed, directly modifying their internal architecture; such biologically mixed concentrations are difficult to interpret (e.g., [73]). The presence of coarse bonebed debris infilling burrows below bonebeds indicates bioturbating invertebrates were able to transpose clasts and bioclasts up to 5 cm in length, and up to 2\u20133 meters below bonebeds (e.g., Fig. 26). This indicates that the architecture of a bioclastic accumulation (when bioturbated) may be misleading when applying the bioclastic concentration model of Kidwell [1]. Bonebeds 2 and 4 both contain a large amount of phosphatic nodules and phosphatized bioclasts, indicating that seafloor erosion was a factor in its formation, although the lack of a clearly preserved scour means its architecture would be interpreted as a hiatal concentration in Kidwell\u2019s [1] scheme (Fig. 25). This has implications for the interpretation of other marine vertebrate bonebeds; for example, Pyenson et al. [105] interpreted the middle Miocene Sharktooth Hill Bonebed as a hiatal concentration rather than a lag concentration due to the lack of evidence of erosion. However, it is possible that an erosional scour was present at some stage, and subsequently erased by bioturbators (Fig. 27); this possibility is borne out by the abundance of fragmented and otherwise taphonomically damaged skeletal elements reported from"} +{"output_text": " be related to each other in a systematic way. These chapters could be organized in a hierarchical fashion, with the most common disorders at the top and the rarest at the bottom. The chapters could be further organized into families of disorders that are related to each other in a systematic way.\n\nThe DSM-5, in its current form, does not have a chapter on autism spectrum disorders, but it does have a chapter on pervasive developmental disorders, which includes autism spectrum disorders. The chapter on pervasive", "input_text": " function, from cognition to emotion to behavioral control, and that these circuit abnormalities do not respect the narrow symptoms checklists within the DSM.\n\nThe first DSM had many important strengths, but I would argue that part of what went wrong with it was a fairly arbitrary decision: the promulgation of a large number of disorders, despite the early state of the science, and the conceptualization of each disorder as a distinct category. That decision eschewed the possibility that some diagnoses are better represented in terms of quantifiable dimensions, much like the diagnoses of hypertension and diabetes, which are based on measurements on numerical scales.\n\nThese fundamental missteps would not have proven so problematic but for the human tendency to treat anything with a name as if it is real. Thus, a scientifically pioneering diagnostic system that should have been treated as a set of testable hypotheses was instead virtually set in stone. DSM categories play a controlling role in clinical communication, insurance reimbursement, regulatory approval of new treatments, grant reviews, and editorial policies of journals. As I have argued elsewhere, the excessive reliance on DSM categories, which are poor mirrors of nature, has limited the scope and thus the utility of scientific questions that could be asked. We now face a knotty problem: how to facilitate science so that DSM-6 does not emerge a decade or two from now a trivially revised descendant of DSM-III, but without disrupting the substantial clinical and administrative uses to which the DSM system is put.\n\nI believe that the most plausible mechanism for repairing this plane while it is still flying is to give new attention to overarching families of disorders, sometimes called meta-structure. In previous editions of the DSM, the chapters were almost an afterthought compared with the individual disorders. It should be possible, without changing the criteria for specific diagnoses, to create chapters of disorders that co-occur at very high rates and that appear to"} +{"output_text": " that it\u2019s not a question that can be answered. It\u2019s not a question that can be answered by anyone]\n\n[Subaru: \u2015\u2015\u2015\u2015]\n\n[Echidona: I\u2019m sorry, but I can\u2019t answer it. I can\u2019t answer it because I don\u2019t know. I don\u2019t know because I don\u2019t know. I don\u2019t know because I don\u2019t know. I don\u2019t know because I", "input_text": "uing your mind, are ones which only the Witch of Envy could answer]\n\n[Subaru: \u2015\u2015\u2015\u2015]\n\n[Echidona: You can mull over them endlessly, but, in all honesty, I doubt you will ever reach an answer. Not about why she pursued you back then, nor about the \u201cPresents that may or may not exist\u201d]\n\n[Subaru: Th..at\u2019s\u2026\u2026]\n\nTo Subaru, that would be far too cruel a reality.\n\nHe wanted to hear it clearly refuted. To be told that the worlds beyond his death never existed.\n\nOr if not, then at least he wanted to hear it outright. That \u201cSo many had been sacrificed for your conceit.\u201d\n\nWhichever the answer, Subaru would have taken it as his admonition, his creed, his reminder to never forget, and though he\u2019d grit his teeth, shed tears of blood and cry out from his very soul, he would turn his steps forward.\n\n\u2015\u2015But for the answer to be \u201cThere is no answer\u201d, isn\u2019t that just far too cruel?\n\nWas he to live, without confirmation or denial, leaving the fate of worlds in this indeterminate limbo?\n\nTo go on without knowing whether his steps were his own. Whether he had abandoned what he had abandoned. Whether his sins were sins. Was this to be his punishment?\n\nWere Natsuki Subaru\u2019s crimes so great that no one could ever forgive him?\n\nNo one was capable of passing judgement on Subaru. No one could condemn him, either. He already understood this.\n\n\u2015\u2015But was even Subaru himself to be denied that right?\n\n[Echidona: I do think it\u2019s harsh. But I also think"} +{"output_text": " few seconds, the add-on will be installed.\n\nStep 4: Now, you can simply click on the icon to download the video.\n\n#3 Download JW Player Videos using Google Chrome\n\nDownloading videos attached on web pages with the help of extensions and add-ons is an old method that actually works. Well, this is true in case of downloading jw player videos. With just an add-on, you can download jw player videos google chrome effortlessly", "input_text": " You can directly open it after landing on a web page by simply pressing Ctrl + Shift +I.\n\nStep 3: Then the element page will be opened. You will certainly face similar to the following capture.\n\nStep 4: Now move your cursor on the video which will cover that space with blue. And you will able to see the source link to download it. For a clear view, refer to the following image.\n\nStep 5: Copy that url and paste it in a new tab. Or, you can Click on Save as and you will be asked to select a path to save it to a specific location.\n\nFor Mozilla Firefox\n\nStep 1: Open the website which has jw player hosted videos and play it for a while.\n\nStep 2: Right-click on that web page and choose \u201cView page info\u201d\n\nStep 3: Now jump to the media section from the upper mega menu.\n\nStep 4: Scroll down a bit to find a video file from the list.\n\nStep 5: Select save as and choose a location where you want to place downloaded video.\n\n#2 Download JW Player Videos using Firefox Add-On\n\nDownloading videos attached on web pages with the help of extensions and add-ons is an old method that actually works. Well, this is true in case of downloading jw player videos. With just an add-on, you can download jw player videos firefox effortlessly. Follow these steps to do so:\n\nStep 1: All you need to have is firefox browser. If you don\u2019t have, install it from here.\n\nStep 2: Now navigate to Firefox > Add-ons. You can simply jump to this navigation using Ctrl+Shift+A. And, search for Flash Video Downloader.\n\nStep 3: Click on Add to Firefox and within a"} +{"output_text": "]: Failed password for root from 125.39.22.154 port 59660 ssh2 Jan 26 03:46:02 host sshd[22731]: Received disconnect from 125.39.22.154: 11: Bye Bye [preauth] Jan 26 03:46:02 host sshd[22731]: Disconnected from 125.39.22.154 Jan 26 03:46:02 host sshd[22731]: PAM 2 more authentication failures;", "input_text": " the Associated Press.\n\nI think most Americans know that immigrants are a net positive to our country, and they always have been. It\u2019s why so many of us get teary-eyed when we recall the sacrifices and struggles of our own immigrant ancestors. The problem is that pride, ignorance, or a combination of both stops too many of us from connecting what the were like and what they went through back when with the immigrants arriving now and what they\u2019re experiencing.\n\nWe could make that connection if not for three things: fear, change, and competition. We fear the unknown. We dread change. And we don\u2019t welcome competition. Immigrants represent all three, and so their continued arrival on our shores brings a fair amount of anxiety\u2014whether they come legally, illegally, or with a letter of recommendation from the Queen of England.\n\nIt\u2019s that anxiety that gives Miller his power. He knows it\u2019s there, and he taps into it with every idiotic and inhumane policy idea. His goal is clear\u2014and now out in the open for all to see: to bleach the U.S. population, and make America white again.\n\nThat may be evil, but it is also Miller's endgame. And we can't let him get away with it, or pretend we didn\u2019t know what he was up to all along. Detecting a SSH Brute Force Attack\n\nIf you are under a SSH brute force attack, you will likely see something like this in your logs.\n\nJan 26 03:46:02 host sshd[22731]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=125.39.22.154 Jan 26 03:46:02 host sshd[22731"} +{"output_text": ":\n\nBut the problem with this kind of thinking is that it\u2019s not just a matter of aesthetics. It\u2019s also a matter of ethics. The problem with the idea that democracy is a good thing is that it\u2019s not a good thing. Democracy is a bad thing. It\u2019s a bad thing because it\u2019s a bad thing.\n\nThe problem with democracy is that it\u2019s a bad thing because it\u2019s a bad thing.\n\nAdvertisement:\n", "input_text": " get into that on the next page. Of U.S. political culture\u2019s many hypocrisies, few are more jarring than Americans\u2019 ambivalence about democracy itself. Truth be told, despite its reputation as \u201cthe leader of the free world\u201d and its history as the \u201carsenal of democracy,\u201d America is a land where democracy is celebrated only in its most abstract and idealized form. Most everyone agrees that government of the people, by the people, for the people sounds pretty great. But when the reality of that principle is revealed \u2014 when all the happy talk of the greater good and the public will is replaced by the prosaic, undignified tedium of actual self-governance \u2014 millions of Americans, on both the left and the right, find themselves so disillusioned that they either reject politics entirely or, worse still, embrace an ideology so rigid and utopian as to serve as a kind of secular faith.\n\nOnce you\u2019ve noticed it, Americans\u2019 discomfort with the grit and grime of real-world democracy can at times feel omnipresent. Take Frank Capra\u2019s beloved 1939 film, \u201cMr. Smith Goes to Washington,\u201d in which Jimmy Stewart\u2019s naive but idealistic Jefferson Smith is able to overcome the corruption and rancor of the U.S. Senate not through negotiation and compromise but because of his indomitable will, evidenced by his decision to filibuster to the point of exhaustion. Or to look at this pathology from the opposite end of the telescope, note how Netflix\u2019s popular \u201cHouse of Cards\u201d series acknowledges the myriad trades and settlements of democratic governance but, through Frank Underwood, a protagonist who is both a master politician and a ruthless sociopath, presents this mode of behavior as fundamentally immoral and corrupt. The good guy keeps on fighting; the bad guy cuts a deal.\n\nAdvertisement"} +{"output_text": "ata di sinistra.\n\nInoltre, la notizia di un\u2019associazione cattolica che diffonde la verit\u00e0, non \u00e8 stata ripresa da altri giornali, come il Corriere della Sera, che ha pubblicato un articolo sulla vicenda, ma solo da Buzzfeed.\n\nInoltre, la notizia di un\u2019associazione cattolica che diff", "input_text": " RTV), o il ferimento di una vigilessa di Salerno in una rissa tra immigrati (notizia apparsa anche su Repubblica), o ancora le tensioni a Sassari tra migranti e popolazione locale. Da quello che si pu\u00f2 verificare consultando le agenzie e i giornali, sono tutte notizie di fatti realmente accaduti (prese da altri giornali, tradotte o riportate) ma con titoli fatti apposta per suscitare indignazione nel lettore.\n\nMa su Inews24.com ci sono anche delle notizie false riprese da altre testate, come il caso del migrante ubriaco che ha aggredito un'infermiera. Non a caso Laura Boldrini, anche lei oggetto di articoli della holding, descrive quella di Buzzfeed come un'inchiesta \"Che rivela come milioni di cittadini italiani siano ogni giorno vittima di informazione spazzatura\".\n\nBuzzfeed ha inoltre raccontato di un legame del signor Colono con un\u2019associazione cattolica, La Luce di Maria, \u201cla cui missione \u00e8 quella di diffondere la verit\u00e0 nel mondo\u201d, su politica, religione e societ\u00e0. Un\u2019associazione che non sembra essere collegata per\u00f2 alle sue attivit\u00e0 di imprenditore, quanto piuttosto al suo credo religioso. Il sito Lalucedimaria.it ha contenuti palesemente religiosi, in alcuni casi antiscientifici, ma non \u00e8 una test"} +{"output_text": ", Srebrenica was a place of refuge for the Serb population of Bosnia. The Serb population of Srebrenica was not a 'safe haven' for the Muslims. The Serb population of Srebrenica was a 'safe haven' for the Serbs of Bosnia. The Serbs of Bosnia were not a 'safe haven' for the Muslims. The Serbs of Bosnia were a 'safe haven' for the Serbs of Bosnia. The Serbs of Bosnia were", "input_text": ", Albanians, Macedonians, Turks, Hungarians, Gorans...In reality, contrary to the image given by the press, Serbia is today the only state of the ex-Yugoslavia, along with Macedonia, that remains'multinational'. On the other hand, all the NATO protectorates - Croatia, Bosnia and Kosovo - practiced an almost total ethnic purification.Milosevic objected to the excesses committed by the Serb militias in Bosnia. His wife made several declarations against them.Did the media correctly report on Srebrenica?NO. First element. Even if it's a matter of condemning abominable crimes, historical truth - necessary for reconciliation - is not served by the propagandistic processes that unreflexively use the term 'genocide', by the obfuscation of the fact that that some of the victims died in combat or by the systematic exaggeration of the numbers.This information was and remains obscured. We won't here go into the argument over numbers which only serious historians will be able to sort out definitively.Second element. Why did the media hide the events essential to an understanding of this drama? In the beginning, this region was inhabited by Muslim AND Serbs.. French general Morillon, who commanded the UN force there, charges: \"On the night of the Orthodox Christmas, the holy night of January 1993, Nasser Oric led raids on Serb villages.... There were heads cut off, abominable massacres committed by the forces of Nasser Oric in all the neighboring villages.\" (Documents of information from the French National Assembly, Srebrenica, t 2, pp. 140-154) The desire for vengeance does not excuse the crimes committed later. But why systematically hide the crimes of 'our friends'?Third element. Like other so-called demilitarized'safe havens"} +{"output_text": " om hans talent. Han var en av de f\u00e5 som kunne spille gitar p\u00e5 en gitar som var laget av en gitarist fra USA.\n\nP\u00e5l var en av de f\u00e5 som kunne spille gitar p\u00e5 en gitar som var laget av en gitarist fra USA.\n\nP\u00e5l var en av de f\u00e5 som kunne spille gitar p\u00e5 en gitar som var laget av en gitarist fra USA.", "input_text": " ta imot behandling. Vi viste til alle som har klart \u00e5 komme seg ut av dette helvetet, til alle hans ressurser, alt han var god p\u00e5, men ingenting hjalp. Selv n\u00e5r vi ba ham om \u00e5 tenke p\u00e5 oss og resten av familien, hjalp det ikke.\n\nForeldrene forteller om overdoser, b\u00e5de med vilje og ved uhell. De har begge funnet ham s\u00e5 ruset at de har ringt 113 og f\u00e5tt hjelp. De har g\u00e5tt gatelangs og lett etter s\u00f8nnen, de har ringt hoteller og spurt etter ham, og de har meldt ham savnet hos krimvakta. De har sovet p\u00e5 skift.\n\nI nedtrappingsperioder kunne faren selv kj\u00f8re P\u00e5l ut for \u00e5 kj\u00f8pe heroin \u2013 for \u00e5 kontrollere at han bare fikk de dosene han og foreldrene hadde avtalt. Grensene ble flyttet. Han var med p\u00e5 noe som han aldri i livet skulle tro var mulig: hjelpe s\u00f8nnen med \u00e5 kj\u00f8pe heroin. Det er ulovlig. Og, mener foreldrene selv, moralsk forkastelig. Men h\u00e5pet om forandring var for stort.\n\nDet var ikke hver dag de merket at s\u00f8nnen var preget av heroin. Det var ogs\u00e5 gode dager med samtaler og hygge. P\u00e5l satt lange perioder p\u00e5 rommet og lastet opp gitarvideoer p\u00e5 Youtube. Der skryter nettbrukere fra hele verden"} +{"output_text": " Venture Bros. at their most vulnerable, it also has the Venture Bros. at their most hilarious.\n\nSeason 4: Episode 17, \"The Biggest Loser.\" The Venture Bros. is a show that's always been about the characters, and Season 4 is no different. The episode is a study in the characters' strengths and weaknesses, and it's a study in the characters' strengths and weaknesses. Brock Samson is a man who's been through hell and back,", "input_text": " got caught up in a Quiz Show-style scandal (that wasn't his doing). The fallout sends Quizboy down a path that includes a depressed bender, alterations including an eye-patch and bionic hand, OSI recruitment, and a mission to infiltrate the Guild through Professor Fantamos (better known later as Phantom Limb). The Venture Bros. leaves much of its world to viewer imagination, but seeing Quizboy's bizarre journey (not to mention how Brock Samson landed his gig with Dr. Venture) only makes the show's current day escapades more enjoyable. For a Quizboy fandom hour, pair this with Season 5's \"Where\u2019s Your Cleansuit?\" to watch a winner-take all trivia showdown between the boy wonder and his new arch, former-trivia-competitor-turned-wealthy-pop-culture-collector Augustus St. Cloud.\n\nSeason 4: Episode 16, \"Operation P.R.O.M.\" Again, every end-of-season special could be on this list (with many fans considering Seasons 2 and 3 to have the best). But \"Operation P.R.O.M.\" reigns supreme. Even if you're the rare Venture fan who feels Season 4 is a slog, push on through. The premise is simple\u2014the Venture brothers are homeschooled, Dr. Venture decides to give them a true prom\u2014but everything else is intricate. All of the time spent depicting these characters as a bit hopeless comes to a head as everyone from the indestructible Brock Samson to the Monarch fails in the most hilarious yet heartbreaking of ways (yes, the tired \"hire a bunch of call girls as dates\" thing happens... but they're actually planted by the Guild as killing machines). While the episode has the"} +{"output_text": "\nThe bezels are a bit on the large side, but they\u2019re not too bad. They\u2019re not as large as the bezels on the Surface Pro 4, but they\u2019re not small either.\n\nThe bezels are a bit larger than the bezels on the Surface Pro 4, but they\u2019re not too bad.\n\nThe display is a 13.5-inch, 2160 x 1440, IPS LCD. It\u2019s a pretty good display", "input_text": " still easy to hold with one hand, and won\u2019t be particularly noticeable in your bag.\n\nThe Surface Go has a standard assortment of ports and buttons; a power button and volume rocker; a Surface Connect port, USB-C port, MicroSD slot, and a Surface Type Cover port.\n\nI\u2019m very glad that they embraced the future and put a USB-C port on here. One of my main complaints from that Surface Laptop was that it didn\u2019t\u2019 have a USB-C port.\n\nI can understand that for some people that will be an annoyance, but it\u2019ll be good in the future.\n\nI do kind of wish that they would switch the Surface Connect port to a USB-C port as well, since either port can be used for charging but that\u2019s just me being nitpicky.\n\nThere are two cameras, front and back. The front utilizes the \u201cHello\u201d login feature, that I gladly welcome. As someone that uses an iPhone X, I love that I can also just unlock the Surface with my face.\n\nIt\u2019s also a pretty damn good webcam. If you tend to make Skype calls or video conference often, you\u2019ll be pleased to know that your face will be crystal clear to the other side.\n\nThe back camera is basically there to be there; it\u2019ll suffice for quick shots, but your photos won\u2019t make it to National Geographic.\n\nThe speakers are exactly as good as you imagine they are. Perhaps even a bit better; they\u2019re perfectly acceptable for watching movies and listening to music, but it won\u2019t be the center of your house party.\n\nDISPLAY\n\nFinally, we\u2019re at the display, but before we jump into the details of the display, we should acknowledge the bezels that surround it.\n"} +{"output_text": " poisoning, and the list goes on and on.\n\n2. Animal Welfare:\n\nThe Torah is very clear that we should not eat animals that are not properly slaughtered, and that we should not eat animals that are not properly treated. The Torah also teaches that we should not eat animals that are not properly fed, and that we should not eat animals that are not properly housed.\n\nThe Torah also teaches that we should not eat animals that are not properly cared for, and", "input_text": " Jewish Vegetarian Videos\n\n20. Related Jewish Organizations\n\n21. Kosher Vegetarian Organizations\n\n22. Miscellaneous Jewish Vegetarian Resources\n\n23. Kosher Vegetarian Restaurants\n\n24. Kosher Vegetarian Caterers\n\n25. Free Vegetarian Starter Kits\n\n26. English-Hebrew\n\n27. Translations?\n\n\n\n1. Personal Health & Safety:\n\nHealth and the protection of life are repeatedly emphasized, and even prioritized, in Jewish teachings. While Judaism teaches that we should be very careful about sh\u2019mirat haguf, preserving our bodies and health, and pekuach nefesh, protecting our lives at almost any cost, numerous scientific studies have linked animal-based diets directly to heart disease and heart attacks (the #1 cause of death in the U.S.), various forms of cancer (e.g., lung, colon, breast, prostate, stomach, and pancreas) (the #2 cause of death), stroke (the #3 cause of death), high blood pressure, obesity, diabetes, osteoporosis, asthma, atherosclerosis, aneurysms, rheumatoid arthritis, impotence, endometriosis, gallstones, gout, Alzheimer\u2019s, and various other very serious ailments. About 2/3 of diseases in the U.S. are diet-related\u2014and vegetarians are much less afflicted. Dayeinu.\n\nFurther, since more than half of all antibiotics in the U.S. are given to livestock (plus immense amounts of chemicals, steroids, hormones, and other drugs), resistant bacteria are increasing at an alarming rate, creating untreatable superbugs, like MRSA, that kill tens of thousands of people per year. And don\u2019t forget mad cow disease, bird flu, foot and mouth, e. coli, salmonella and food"} +{"output_text": ", and Jews were not allowed to apply.\n\n\"The prime minister, William Lyon Mackenzie himself, bought all the lands surrounding his house because he did not want Jews to become his neighbours,\" Stone said.\n\n\"He was a very wealthy man and he was very concerned about the Jewish people.\"\n\nThe Jews were not welcome in private clubs like the Puffin Ski Club. (Legislative Library of Manitoba)\n\nThe Jews were also not allowed to buy or", "input_text": "?\"\n\n'They were going to ruin the place'\n\nVictoria Beach was not the only place where Jews were discriminated against in Manitoba \u2014 most of the resorts in the eastern part of Lake Winnipeg excluded them, confirms Daniel Stone, a retired professor of history at the University of Winnipeg.\n\n\"The fanciest places in Manitoba were owned by the Anglo establishment,\" he said. \"They were so-called gentlemen's agreements not to sell to Jews because people thought they were going to ruin the place.\"\n\nStone adds that the most famous example of this is Victoria Beach because, at the time, the Winnipeg Free Press editor, John Dafoe, denounced it.\n\n\"The Nazis of Europe are making it plain to the Jewish people that they would not live with them. Here, in Manitoba, the summer residents of Victoria Beach are engaged in a similar crusade,\" Dafoe wrote on August 17, 1943.\n\nThe prime minister, William Lyon Mackenzie himself, bought all the lands surrounding his house because he did not want Jews to become his neighbours. - Belle Jarniewski\n\nIn response, Stone said the editor of the Victoria Beach Herald stated that his opinions \"were much more moderate\" than the ones of other people who lived in Victoria Beach.\n\nThe Jews' exclusion was not restricted to vacation spots. In Winnipeg, they could not buy or rent properties in Tuxedo or around Wildwood Park, according to Stone. They were also not welcomed in private clubs like the Puffin Ski Club.\n\nThe Victoria Beach Herald did not specifically mention the Jews as unwanted people, but at the time, it was obvious according to Daniel Stone. (Legislative Library of Manitoba)\n\nAside from that, the University of Manitoba School of Medicine was also screening its applications. An official quota system was adopted in 1932"} +{"output_text": " is een film over de ramp met de vliegtuig MH17.\n\nDe lijsttrekker heeft zin in de campagne en denkt dat drie Kamerzetels haalbaar is en gaat uit van minstens \u00e9\u00e9n. \u201cHet is vooral zaak onze achterban te mobiliseren en dat we ons verder richten op mensen die nu niet stemmen omdat ze teleurgesteld", "input_text": " maanden weggelegd voor de in juni gekozen lijsttrekker Ancilla van de Leest. Zij kreeg een paar weken geleden haar vuurdoop toen het partijbestuur uit onvrede over haar leiderschapsstijl opstapte en lid werd van Forum voor Democratie, de nieuwe partij van publicist Thierry Baudet. Van de Leest kreeg het verwijt het partijbestuur buitenspel te zetten en alles binnen de partij met haar eigen vertrouwelingen te bepalen. De lijsttrekker wuift de kritiek op het congres weg. \u201cZij zijn gevraagd te vertrekken, we hebben dit probleem met z\u2019n allen opgelost. Deze crisis is achter de rug\u201d, aldus Van de Leest in een interview op het congres.\n\nDe lijsttrekker heeft zin in de campagne en denkt dat drie Kamerzetels haalbaar is en gaat uit van minstens \u00e9\u00e9n. \u201cHet is vooral zaak onze achterban te mobiliseren en dat we ons verder richten op mensen die nu niet stemmen omdat ze teleurgesteld zijn in de huidige politiek.\u201d Kanne geeft de mediagenieke Van de Leest een goede kans. \u201cHet is wel zaak de Piraten nu de rijen sluiten, je moet niet blijven ruzi\u00ebn. Anders kun je als kleine partij wel inpakken.\u201d The Ghosts of Flight 191"} +{"output_text": ", which says, \u201cFor this cause God gave them up unto vile affections: for even their women did change the natural use into that which is against nature: And likewise also the men, leaving the natural use of the woman, burned in their lust one toward another; men with men working that which is unseemly, and receiving in themselves that recompence of their error which was meet.\u201d\n\n\u201cI\u2019m not saying that\u2019s the only reason, but it\u2019", "input_text": " months ago.\n\nIf there was one message in the massacre, it seemed to be that LBGT people are still not safe, and that religious teachings -- or at least a narrow reading of them -- may be a contributing factor to hatred against gays.\n\nReligious leaders from Pope Francis to the Florida chapter of the Council on American-Islamic Relations sharply condemned the shooting.\n\nThe Vatican's spokesperson, the Rev. Federico Lombardi, said Pope Francis shares in the victims' \u201cindescribable suffering\u201d and \u201che entrusts them to the Lord so they may find comfort.\u201d\n\nMuslim groups also condemned the killings.\n\n\"The Muslim community joins our fellow Americans in repudiating anyone or any group that would claim to justify or excuse such an appalling act of violence,\" read a statement from the Council on American-Islamic Relations. The Florida chapter also called on the Muslim community to take part in a blood drive for those wounded in the attack.\n\nBut such words from religious groups provided cold comfort to many gay activists.\n\n\u201cThere\u2019s such a cognitive dissonance for me when public officials ask us to pray when the majority of world religions promote anti-LGBT theology,\u201d said Eliel Cruz, executive director of Faith in America, an organization that attempts to end the harm to LBGT youths it says is caused by religious teachings.\n\n\u201cThis isn\u2019t isolated to Muslim beliefs. It\u2019s seen in Christianity and it\u2019s just as deadly,\u201d added Cruz, a former RNS columnist.\n\nJust last month in Congress, Rep. Rick W. Allen, from Georgia\u2019s 12th District, led a Republican policy group\u2019s opening prayer by reading Bible passages that condemn homosexuality and those \u201cwho by their unrighteousness suppress the truth.\u201d\n\nAllen read from Romans 1:28-32"} +{"output_text": " law to register as sex offenders, and those who were convicted of crimes, including murder, rape, and child molestation.\n\n

    ICE arrests more than 100 illegal aliens in California#ICE <", "input_text": "?ref_src=twsrc%5Etfw\">@realDonaldTrump @POTUS #Trumpville \ud83c\uddfa\ud83c\uddf8 https://t.co/EtIgblivll

    \u2014 GunLovinTrumpGirl\ud83d\udc60 (@lmchristi1) March 17, 2018
    \n\n\n\nICE said that the most recent series of sweeps during the three-day operation was meant to \u201ctarget the public safety threats\u201d posed by these illegals, in which ICE agents are now having to commit to \u201cold-fashioned detective work\u201d on the ground since California is unwilling to assist in their operations.\n\nIntelligence gathering, interrogation of individuals and businesses who would hire illegals over American taxpayers, and other tactics must now be used by ICE to hunt down the most dangerous of illegals in the rogue state of California, which has infuriated residents of the state at the Southern California leadership, for their desire to ignore the demands of the rest of the state's citizens and instead promote a lawless society of corruption.\n\nAlso targeted during the sweeps were individuals who were required by"} +{"output_text": " Manchester City to sign the most expensive player in the world, Kylian Mbapp\u00e9, from Monaco. Manchester United, who face Chelsea in the other semi-final, have signed the most expensive player in the world, Paul Pogba, from Juventus.\n\nArsenal, who have won the competition three times, have also signed the most expensive player in the world, Mesut \u00d6zil, from Real Madrid.\n\nThe Arsenal manager, Unai Emery", "input_text": " 4 pounds even is accurate, as advertised.\n\nMy one real issue with the gun is how they got that trigger pull. I like triggers that have a set mechanical stop, then you apply pressure and they break, and the gun shoots. The CZ P-10 C trigger has what feels like a stop point, then some more movement without a stop behind it, then the gun shoots. This is a subtle movement, but it is present. I would guess 99.9% of the shooters that pick up this gun won\u2019t even notice. To be fair, on the range, I barely noticed, too. The gun is incredibly accurate, and the speed drills I was shooting with this gun speak for themselves. It handles well, it is easy to control, and the trigger isn\u2019t bad. I for one would really like to shoot this gun in.40 S&W, see how it tames that beast. At a street price of under $450, this CZ is absolutely a bargain.\n\nFor more information, visit http://cz-usa.com/product/cz-p-10-c/.\n\nTo purchase a CZ P-10 C on GunsAmerica.com, click this: CZ P-10 C. Rarely has the Emirates Stadium felt so barren or bitter, yet so full of promise. On a Monday night where temperatures arrived by express delivery from the Siberian steppe \u2013 one CSKA Moscow player even wore leggings \u2013 Arsenal became the third English side to reach the last four of the NextGen Series, the unofficial Champions League for under-19 teams, with a comfortable 1-0 win in front of nearly 7,000 people.\n\nIt was the latest show of strength in the futures market from an English side. Chelsea, who face Arsenal in the semi-finals on Good Friday, recently beat"} +{"output_text": ". Let\u2019s add a list of posts to the home page.\n\n(def posts (atom [])) 1 2 3 4 5 6 7 8 ( def posts ( atom [ ] ) )\n\nThe posts are stored in the atom, so we can add a post to the list by calling the add function.\n\n(defn add-post [post] (swap! posts conj post)) 1 2 3 4 5 6 7 ( defn add-post [ post ] (", "input_text": " function. So we use delegate to pass the request as the first argument.\n\n(ns clog.core (:use ring.adapter.jetty ring.middleware.resource ring.middleware.reload ring.util.response net.cgrand.moustache clog.controller)) ;; Routes definition (def routes (app [\"\"] (delegate index))) ;;; start function for starting jetty (defn start [port] (run-jetty #'routes {:port (or port 8080) :join? false})) (defn -main [] (let [port (Integer/parseInt (System/getenv \"PORT\"))] (start port))) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ( ns clog. core ( : use ring. adapter. jetty ring. middleware. resource ring. middleware. reload ring. util. response net. cgrand. moustache clog. controller ) ) ;; Routes definition ( def routes ( app [ \"\" ] ( delegate index ) ) ) ;;; start function for starting jetty ( defn start [ port ] ( run-jetty #'routes { : port ( or port 8080 ) : join? false } ) ) ( defn -main [ ] ( let [ port ( Integer / parseInt ( System / getenv \"PORT\" ) ) ] ( start port ) ) )\n\nStart the server as shown below, and you should see the home page with title in the window title bar \u201cClog \u2013 the clojure blog engine!\u201d.\n\n(use 'clog.core) (start 9000) 1 2 ( use'clog. core ) ( start 9000 )\n\nList of Posts\n\nThe home is working, but it is rather boring and doesn\u2019t show any real data"} +{"output_text": " she worked on Hillary Clinton\u2019s campaign.\n\n\u201cI\u2019m a progressive Democrat, but I\u2019m also a pragmatist,\u201d Davis said. \u201cI\u2019m a progressive Democrat who believes in the power of the people to make change.\u201d\n\nDavis is a member of the Democratic Socialists of America, a group that has been growing in popularity in the United States. She is also a member of the Democratic Party.\n\n\u201cI\u2019m a progressive Democrat who believes", "input_text": " believes, means avoiding discussion of important ideas that she wants to bring into the light.\n\nThe 8th District, which covers part of Omaha, is currently served by Sen. Burke Harr, who is ineligible for re-election due to term limits. It is comprised mostly of working-class, low-income people and families \u2014 people who Davis says don\u2019t have the time or resources to represent themselves and their interests in the state and national government. She aims to be their voice.\n\nOn Christmas Day, Davis\u2019 office received a \u201cglitter bomb,\u201d or an envelope stuffed with glitter that\u2019s designed to make a mess when opened. Though she was able to contain the glitter, Davis expressed concern to the police that hate mail could be next. \u201cI\u2019m a young woman of color \u2026 It\u2019s the age of Trump and people have gotten much more malicious to express their displeasure,\u201d she told a local news channel.\n\nDavis was raised in a family that drilled into her the importance of being an active citizen. Her father is a veteran who served 6 years in the Air Force, and he strove to keep Davis and her sisters politically informed and enthusiastic about their educations. Davis\u2019s mother is an immigrant who obtained United States citizenship when Davis was a young girl. Davis remembers helping her mother study for the citizenship test \u2014 an experience that impressed upon her the value of being a citizen of this country and planted a desire serve.\n\nRelated: VoteRunLead Wants Women in Power, Regardless of Party\n\nThis is the recent college graduate\u2019s first run for office. She credits the 2008 Nebraska Democratic caucus between Barack Obama and Hillary Clinton, which she attended with her father, as the event that inspired her political career. She worked on President Barack Obama\u2019s re-election campaign in 2012 and became a fellow for him that year. In 2016,"} +{"output_text": " don't know the language?\"\n\nThe commission's report, which was published in the summer of 2016, is a long, dense document that is difficult to read. It is written in a style that is at once impenetrable and opaque. The commission's conclusions are based on a series of interviews with Gil, who was not allowed to see the report before it was published.\n\nThe commission's report is a long, dense document that is difficult to read. It is written in", "input_text": " \"but it began to sound fishy when I found out some of the graffiti that was found while washing came from my work area. I was excavating with a trowel, bear in mind. I might miss one or two, but not 20. So I brought to the site one of those Chinese food tins to wash the sherds as we removed them and we didn't find anything.\"\n\nNot fazed by the mounting evidence against him, Gil's defense will include \"several sworn and signed testimonies from witnesses, among other things, concerning the finding of certain artifacts,\" Gil tells me. \"I'm an archaeologist, I'm not the person who certifies the authenticity of artifacts. We need to verify absolutely everything, but I haven't seen anything to make me think they are fake...The only thing I can do is put my faith in the justice system. Consciously or unconsciously [the commission and the \u00c1lava government] have destroyed my career.\"\n\n\n\n\n\nSome of the inscriptions seem to conveniently avoid damage that was done to the pottery after it was buried. (Courtesy \u00c1lava Provincial Government)\n\nThe first hearing took place on June 30, a few days before this article went to press. In court, Gil will likely try to pick apart the commission's conclusions, and he might score a few victories.\n\n\"People have put too much faith in the commission,\" says Canto. \"Its reports are not completely reliable. Linguistics doesn't always explain epigraphy. There are always exceptions. You can rarely say that a certain word is impossible because, like now, people wrote badly. But of course, there are texts like Deidre and CVORII that are beyond salvation. And several of the commission experts at one point believed [the inscriptions] to be authentic. How can they claim to be experts when they"} +{"output_text": ". A. Kozlov, A. A. Kozlov, A. A. Kozlov, A. A. Kozlov. A novel approach to the synthesis of 2-aryl-1,3-dioxanes via the reaction of 2-aryl-1,3-dioxolanes with aryl halides. Organic Letters 2019, 21 (23), 5763-5766. 21 (23), 5763-5766. https://doi.org/", "input_text": " C\u2013C Coupling and C\u2212N Condensation Cascade Reactions. Advanced Synthesis & Catalysis 2019, 361 (14), 3312-3317. 361 (14), 3312-3317. https://doi.org/10.1002/adsc.201900096\n\nBartholom\u00e4us Pieber, Jamal A. Malik, Cristian Cavedon, Sebastian Gisbertz, Aleksandr Savateev, Daniel Cruz, Tobias Heil, Guigang Zhang, Peter H. Seeberger. Semi\u2010heterogene duale Nickel\u2010/Photokatalyse mit Kohlenstoffnitriden: Veresterung von Carbons\u00e4uren mit Arylhalogeniden. Angewandte Chemie 2019, 131 (28), 9676-9681. 131 (28), 9676-9681. https://doi.org/10.1002/ange.201902785\n\nBartholom\u00e4us Pieber, Jamal A. Malik, Cristian Cavedon, Sebastian Gisbertz, Aleksandr Savateev, Daniel Cruz, Tobias Heil, Guigang Zhang, Peter H. Seeberger. Semi\u2010heterogeneous Dual Nickel/Photocatalysis using Carbon Nitrides: Esterification of Carboxylic Acids with Aryl Halides. Angewandte Chemie International Edition 2019, 58 (28), 9575-9580. 58 (28), 9575-9580. https://doi.org/10.1002/anie.201902785\n\nE. V. Matus, D. V. Nefedova, O. B. Sukhova, I. Z. Ismagilov, V. A. Ushakov, S"} +{"output_text": " it before, and it was a trick he\u2019d learned from a circus.\n\n\u201cHe was a very clever fellow,\u201d said the friend. \u201cHe was a good fellow, and he was a good friend.\u201d\n\nA man who fell from a horse\n\nThe death of a man who fell from a horse was a common enough occurrence in the 19th century.\n\nBut it was a particularly gruesome one.\n\nIn September 1894, a man named William H. Smith", "input_text": " be believable but safe, and hadn\u2019t budged in the scene. But Crozier leaned in. Maybe that wouldn\u2019t have mattered too much if Franks had used a harmless stage knife from the theatre\u2019s props department. Unfortunately he used his own \u2013 a sharp and slender stiletto with a jewelled handle.\n\nThe actor stumbled, turned twice from the blow and fell on his back with the dagger sticking in his chest. \u201cDon\u2019t worry, I\u2019m alright,\u201d Crozier told his unwitting killer.\n\nThree surgeons were speedily on the scene, but to no avail. \u201cDeceased moaned and expired,\u201d concluded the Evening Post.\n\nA man choked by a billiard ball\n\nAs stunts go, it left a little to be desired. But it was Walter Cowle\u2019s party piece, and he was going to stick to it.\n\nThe 24-year-old was in the pub with his pals in November 1893, when talk turned to the tricks they could perform.\n\nEager to show off, Walter asked the landlord of the Carlisle Arms in Soho for a billiard ball, then placed it in his mouth with a flourish, and closed his mouth.\n\nTa-da!\n\nUh-oh.\n\n\u201cHe evinced signs of choking,\u201d reported the Grantham Journal. \u201cHis back was slapped and his head held down, in the hope that the ball would fall forward and out of his mouth. It did not, however and Cowle was at once conveyed to Middlesex Hospital, where he was found to be dead.\n\n\u201cIt was only when the post-mortem examination was made by Dr Sidney Bulke, resident surgeon, that the ball could be extracted.\u201d\n\nHis friend told the inquest he\u2019d seen him do"} +{"output_text": "ardi, Ph.D.\n\n[3] National Institute on Alcohol Abuse and Alcoholism\n\n[4] National Institute on Drug Abuse\n\n[5] National Institute on Alcohol Abuse and Alcoholism\n\n[6] National Institute on Drug Abuse\n\n[7] National Institute on Alcohol Abuse and Alcoholism\n\n[8] National Institute on Drug Abuse\n\n[9] National Institute on Alcohol Abuse and Alcoholism\n\n[10] National Institute on Drug Abuse\n\n", "input_text": " the simplest task like walking to be dangerous. 31% of the pedestrians killed by cars were drunk.\n\nBinge drinking can also lead to fights resulting in murder, sexual assaults, drunk driving arrest (DUI) and or death and unplanned pregnancies.\n\nImpaired judgment can lead to embarrassing incidents, too many to list here.\n\nWhen it\u2019s Time to call an Ambulance\n\nAnother danger with underage binge drinking is when someone does exhibit signs of alcohol poisoning, teens are afraid to call for an ambulance, since they realize they are all breaking the law with underage drinking which might prevent them from calling for help.\n\nParamedics (911) should be called when any one of these symptoms occurs:\n\nThe person has passed out and cannot be woken\n\nCold, clammy, pale or bluish skin\n\nSlow breathing, fewer than 8 breaths per minute\n\nErratic breathing, 10 seconds or more between breaths\n\nVomiting\n\nLow body temperature\n\nSeizures\n\nConfusion, stupor or coma\n\nTrouble breathing after vomiting\n\nConclusion on Binge Drinking\n\nTeens and adults might read this and say these things won\u2019t happen to me when I drink. Remember, these statistics are full of people who said these things would never happen to them when they end up binge drinking.\n\nBinge drinking causes a person to get drunk fast, which causes the loss of good judgment fast. The do not realize how drunk they are getting when binge drinking. They might throw up and continue drinking getting so drunk they actually think they are sober until it could be too late.\n\n\u00a9 November 2, 2010 Sam Montana\n\nResources\n\n[1] NewsOK.com\n\n[2] American Heart Association\n\nCollege Drinking Prevention John Er"} +{"output_text": " she could even scream. She was carried out of the bathroom and into the bedroom. She was pushed onto the bed and her legs were spread wide. She was then tied to the bed with a rope. She was naked and helpless. She was not allowed to move. She was not allowed to scream. She was not allowed to cry. She was not allowed to do anything. She was not allowed to think. She was not allowed to feel. She was not allowed to do anything. She was", "input_text": " can just hold out till next week. When Victoria awoke the next Sunday morning, she remembered that she had given the housekeeper the day off. She was alone in her very large house. It had twenty three rooms, an outdoor swimming pool that was of little use in Seattle, and a smaller house for the servants. Cole hadn't told her when she would be kidnapped, but since he had asked her where she would go and when, she expected it to happen when she left the house to have lunch at her favorite Sunday restaurant. She told her driver to take the day off, so she would be free to go out alone. After a very light breakfast, she got into the shower and started to shave. She always shaved her labia and left just a little triangle of hair above. After shaving, she spent several minutes stroking herself and thinking about what might happen to her later that day. She was reasonably certain that Cole had understood her requests for rougher play. Just as she was on the verge of coming, she stopped the erotic self stimulation in order to leave herself with a nice edge. \"Oooh, gotta stop now,\" she said to herself. Having a hair stylist on call, she normally did not wash her own hair. Unfortunately, the stylist was not available today and she resigned herself to being without her perfect hairstyle until tomorrow. It's going to get messed up anyway, she hoped. Reaching for the shampoo, she proceeded to wash her long blonde hair. It pleased her to see that she remembered how. With her eyes tightly closed, she put her head under the shower stream and rinsed out the floral scented shampoo. Without warning, a strong arm reached past the frilly shower curtain and wrapped around her waist. She was lifted completely out of the tub and a black cloth bag was placed over her head before"} +{"output_text": " do centro hist\u00f3rico, \u201cpara que seja preservado e valorizado\u201d. Ao mesmo tempo, a entidade recomenda que sejam \u201cadotadas medidas de protec\u00e7\u00e3o\u201d para os edif\u00edcios hist\u00f3ricos, \u201cpara que sejam preservados e valorizados\u201d.\n\nAinda assim, a presidente do Icomos, que \u00e9 tamb\u00e9m presidente da Associa\u00e7\u00e3o de Turismo do Porto, diz que \u201cn\u00e3o se", "input_text": " car\u00e1cter \u00fanico.\u201d, descreve Soraya Genin.\n\nO melhor do P\u00fablico no email Subscreva gratuitamente as newsletters e receba o melhor da actualidade e os trabalhos mais profundos do P\u00fablico. Subscrever \u00d7\n\nA entidade consultiva avan\u00e7a com n\u00fameros recolhido junto do Registo Nacional de Turismo para o Porto, para justificar a conclus\u00e3o anterior: entre Abril de 2009 e Abril de 2010 fizeram-se oito pedidos para alojamento local na cidade; entre Abril de 2015 e Abril de 2016, os pedidos j\u00e1 eram 849. \u201cApenas no site do Airbnb existem 2710 unidades de alojamento no Porto\u201d, acrescenta-se. J\u00e1 a popula\u00e7\u00e3o do centro hist\u00f3rico, avalia o Icomos recorrendo aos \u00faltimos tr\u00eas Censos, desceu \u201cmais de metade\u201d, em rela\u00e7\u00e3o \u00e0 que era quando o centro hist\u00f3rico da cidade foi classificado pela UNESCO, em 1996.\n\nDepois da an\u00e1lise, dura, como de costume \u2013 e que j\u00e1 levou o presidente da C\u00e2mara do Porto, Rui Moreira, a acusar o organismo de \u201cportofobia aguda\u201d, a prop\u00f3sito do projecto para a Esta\u00e7\u00e3o de S. Bento \u2013, o Icomos deixa recomenda\u00e7\u00f5es que, frisa, devem ser tomadas de forma \u201curgente\u201d. A presidente do \u00f3rg\u00e3o consultor da UNESCO pede ao Comit\u00e9 do Patrim\u00f3nio Mundial que pe\u00e7a \u201cuma gest\u00e3o efectiva e de protec\u00e7\u00e3o\u201d"} +{"output_text": " that the FCC will be able to do anything to change that.\u201d\n\nThe FCC\u2019s net neutrality rules, which were adopted in 2015, prohibit broadband providers from blocking or throttling lawful content, or from charging companies for faster delivery of their content.\n\nThe rules also prohibit broadband providers from creating \u201cfast lanes\u201d for companies willing to pay more for faster delivery.\n\nThe rules were adopted after a years-long legal battle that pitted the FCC against the broadband industry. The agency", "input_text": " the government to address a perceived market failure with a complex regulatory regime.\n\nThat comparison may gain even more relevance now as the GOP prepares to repeal Obamacare \u2013 and take net neutrality along with it.\n\n\u201cExpect the Trump FCC to hit the ground running quickly,\u201d said Berin Szoka, president of the right-leaning think tank TechFreedom.\n\nTrump is expected to appoint a Republican FCC chair in 2017 who could vote to roll back Wheeler\u2019s decisions with the support of the agency\u2019s two other conservatives, Ajit Pai and Michael O\u2019Rielly. (Both declined to comment.) It is still unclear whom Trump may nominate as chair, and a Trump spokesman did not immediately respond to a request for comment.\n\nDemocrats could attempt to mount a defense by seeking congressional legislation that would enshrine the principles, if not the language, of the net neutrality rules in federal law. Consumer advocates, meanwhile, vowed to defend the privacy regulations from attack.\n\n\u201cWe expect to aggressively fight any attempt to allow AT&T, Comcast and other giant ISPs to spy on Americans,\u201d said Jeffrey Chester, executive director of the Center for Digital Democracy. \u201cWe hope that the new administration supports consumer privacy rules, including as a key component of ensuring a competitive marketplace.\u201d\n\nWith Republicans in control of all three branches of government, however, there is little incentive for a deal, analysts said.\n\nIn the end, whoever takes the helm at the FCC could shatter a system that until now had largely benefited Web companies over network operators.\n\n\u201cFor a while, the joke was that it\u2019s not the FCC, it\u2019s the NCC \u2013 the Netflix Communications Commission,\u201d said Roger Entner, a wireless analyst at Recon Analytics. \u201cEverything Netflix wanted, it basically got. It\u2019s hard to imagine [under Trump]"} +{"output_text": " beam will be focused into a smaller and smaller beam, until it is a single proton beam, with a diameter of about a millimeter. The beam will then be accelerated to a speed of nearly the speed of light, and then be sent through a series of magnets to focus it into a beam of protons, which will then be sent through a series of magnets to focus it into a beam of protons, which will then be sent through a series of magnets to focus it into a beam of protons,", "input_text": " 17 mile long tunnel filled with 1,232 pressurized cylinders weighing 35 tons each, with 11,000 amps of electricity running through them, super-cooled and incredibly magnetic?\n\no What would have happened to the land above?\n\no What would have happened to the Earth?\n\no What happens when this powerful beam is no longer going through dipoles but through the concrete into the surrounding earth?\n\no And what did happen as a result of this accident? Did it cause fractures in the rock, or cause an earthquake?\n\no What was the impact to the two villages that sat atop this disaster, a mere 330 feet above?\n\no How could a system be built that has no sensors and no warning systems for situations like this, that deals with this level of electricity, coolant, pressure, and force (see below)? Is this an example of how oblivious these physicists are about reality and proper safety procedures, or of how cock-sure they are that nothing will go wrong?\n\no Why have the French and Swiss governments exempted this facility from normal safety requirements such as sensors and warning systems, especially when their citizens are most in danger if something goes wrong?\n\nThe process of making the proton streams:\n\n\u201cAtom by atom, the electrons will be stripped from each hydrogen nucleus to create free protons, which will then be beamed into a series of four pre-accelerators of increasing size, one after another, in a kind of loop-de-loop, each pre-accelerator powering the beam up by a factor of 10 or 20 or 30, finally up to 3.5 [trillion electron volts] and\u20267 trillion electron volts. [and now, In 2015, the power will be nearly doubled to 13 trillion electron volts (TeV)] As the energy increases, the"} +{"output_text": " who is willing to take risks and push for bold policy changes. He has been willing to take on the teachers\u2019 unions, and has been willing to take on the teachers\u2019 unions. He has been willing to take on the teachers\u2019 unions, and has been willing to take on the teachers\u2019 unions. He has been willing to take on the teachers\u2019 unions, and has been willing to take on the teachers\u2019 unions. He has been willing to take on the teachers\u2019 unions, and has been", "input_text": " was used to rate schools. In order to achieve \u201cthree stars,\u201d a school needed a score of 100 points. The average score for RSD schools was 60.9 points. Ignoring this evidence, Pastorek and Vallas simply declared victory.\n\nRemember, that is what reformers do: Ignore the evidence and declare victory.\n\nThe Advent of John White, and Bobby Jindal\u2019s Second Term\n\nIn 2010, Jindal was nearing the end of his first term as governor, and he was keeping rhetoric toned down regarding his plans to follow the ALEC playbook and usher in \u201csweeping educational reforms\u201d should he be reelected in 2011. Jindal did begin campaigning early, in March 2011, following a seven-point drop in the polls. Jindal was perceived as a governor who rode in on opportunity (disfavor with democratic leadership during Katrina) then offered no \u201cfollow-through\u201d:\n\n[After Jindal\u2019s 2007 election,] disillusionment settled in quick. His \u201cblue ribbon\u201d ethics reform was marred by ineffective enforcement and his interest in state policy rode constantly in the backseat behind his national political ambitions. His leadership was defined not by its boldness or its ability to transform; but by its caution, his unwillingness to take meaningful risks even in pursuit of desirable policy outcomes. He stood on the right side of many issues but seemed unwilling to push too hard, and subsequently accomplished too little of substance. At the beginning of 2011, Jindal was looking like a status quo governor, an uninteresting \u201canti-tax\u201d cookie-cutter Republican who presided more than he led. [Emphasis added.]\n\nThe article continues to describe \u201csome change in Jindal\u201d:\n\nIn the past couple of weeks, Jindal has emerged as the governor"} +{"output_text": " He just looked at us.\n\n\u201cWhat\u2019s going on?\u201d I asked.\n\n\u201cYou\u2019re being suspended.\u201d\n\n\u201cFor what?\u201d\n\n\u201cFor being in possession of a weapon.\u201d\n\n\u201cWhat weapon?\u201d\n\n\u201cA knife.\u201d\n\n\u201cWhat knife?\u201d\n\n\u201cA knife.\u201d\n\n\u201cWhat knife?\u201d\n\n\u201cA knife.\u201d\n\n\u201cWhat knife?\u201d\n\n\u201cA knife.\u201d\n\n\u201cWhat knife?\u201d\n\n", "input_text": " noticed was my backpack, on the desk, surrounded by its contents, and next to it, Shirelle\u2019s bag and its contents, too. That was when I noticed Shirelle, sitting on a low sofa. The cop indicated the spot next to her with a tilt of his head and I sat. The late-afternoon sun slanting through the window caught the huge fart of dusty air that escaped from its cushions when I settled in. Shirelle coughed a little and caught my eye. She looked scared. Really scared.\n\nThe cop pulled the vice principal\u2019s chair out from behind the desk and sat down on it in front of us. He didn\u2019t say anything. He was young, I saw, not much older than us, and still had some acne on one cheek. White dude. Not my type, but good looking, except that he was a cop and he was playing mind games with us.\n\n\u201cAre we being detained?\u201d Somewhere in my bag was a Black Lives Matter bust-card and while I\u2019d forgotten almost everything written on it, I remembered that this was the first question I should ask.\n\n\u201cYou are here at the request of your school administration.\u201d Oh. Even when there wasn\u2019t a fresh lockdown, the administration had plenty of powers to search us, ask us all kinds of nosy questions. And after a lockdown? Forget it.\n\n\u201cAre we entitled to lawyers?\u201d Shirelle\u2019s voice was a squeak, but I was proud of her. She remembered the second line from the bust-card.\n\n\u201cYou are not.\u201d The cop looked smackably smug.\n\nI didn\u2019t say anything. That was definitely the third line of the bust-card. Keep your damned mouth shut.\n\nHe didn\u2019t say anything either."} +{"output_text": " we\u2019re going to go into the next one.\n\nG: Okay.\n\nJ: So, we\u2019re going to go into the next one.\n\nG: Okay.\n\nJ: So, we\u2019re going to go into the next one.\n\nG: Okay.\n\nJ: So, we\u2019re going to go into the next one.\n\nG: Okay.\n\nJ: So, we\u2019re going to go into the", "input_text": " the cap on Social Security.\n\nYes we should raise the cap on \u2013 we should eliminate the cap on Social Security. FICA tax. Which pays into Social Security, helps pay for Social Security (which is nonsense, which is nonsense). But FICA pays into Social Security, so we feel invested and years later say, \u201cI\u2019ve been paying my whole life, don\u2019t take my Social Security away.\u201d It\u2019s complete nonsense but setting that aside.\n\nBernie Sanders says we should eliminate the cap on Social Security and we absolutely should. But not because we need the money. Because Social Security isn\u2019t paid for by taxes. It\u2019s not paid for by FICA taxes. FICA taxes just disappear and they go away. They\u2019re not used to pay for Social Security because simply because of everything we\u2019ve been talking about. You eliminate the cap on Social Security taxes, not because we need the money, not because Social Security is running out of money (because it doesn\u2019t make sense for a federal program to run out of money. That doesn\u2019t make sense) We eliminate the cap because it\u2019s a regressive tax, that only the powerless are paying that tax.\n\nThe powerful are not paying that tax. We should really be eliminating that tax for everybody. That\u2019s what we should do. Because it\u2019s just a trick to make people feel like they\u2019re invested. It is not actually paying for things. There\u2019s actually a quote from FDR saying it was just you know I did it because there\u2019s no way that people are going to take it away from them because they\u2019re paying into it. But they\u2019re not. So, I\u2019m not going to go into Social Security but that\u2019s a really interesting subject.\n\nG: Okay.\n\nJ: Okay. So,"} +{"output_text": "Lopez\u2019s new gym, Factory X, was born out of a desire to create a place where fighters could train and develop their skills.\n\n\u201cI wanted to create a place where fighters could come and train and develop their skills,\u201d he says. \u201cI wanted to create a place where fighters could come and train and develop their skills. I wanted to create a place where fighters could come and train and develop their skills. I wanted to create a place where fighters could come and train and", "input_text": " fought for M1 and Shooto on cards across continental Europe. His last fight came in 2010, against Matteo Piran at the Abu Dhabi Fighting Championships. He won by TKO in the first round.\n\nThough he did not make it to the UFC, Lopez was delighted with his MMA career; thanks mostly to what he had been through prior. \u201cI was so happy, because I was the guy who was supposed to die,\u201d he says. \u201cI was \u2018killed\u2019. So even when I was losing, it was a victory for me. I was so happy to be alive and to be healthy.\u201d\n\nBy 2010, Lopez was in his early thirties and essentially retired from MMA competition. But, over the course of his short-lived MMA career, he had started coaching alongside Nicourt at Free Fight Academy. When he hung up his own gloves he focused purely on coaching, putting INSEP qualifications in strength training, nutrition, and sports performance to good use.\n\nAlong with regularly coaching four MMA fighters, Lopez started acting as a talent scout for the gym. It was a role he relished and proved to be pretty good at, bringing in talented fighters who other gyms hadn\u2019t given much of a chance.\n\nAfter a few years, however, he decided to leave Nicourt\u2019s gym. A big reason behind the move was a difficult situation involving a fighter he had been working very closely with. The fighter, who Lopez says was like a son to him, decided he didn\u2019t want Lopez in his corner anymore, instead favoring another trainer at the gym.\n\nDistraught, Lopez decided he needed a fresh start and a new camp. However, he had no desire to join someone else\u2019s business. Instead he wanted to start his own.\n\nTHE FACTORY THAT FOUND NGANNOU\n\n"} +{"output_text": " results... *\n\nEmail This field is for validation purposes and should be left unchanged.\n\nCompleting this poll entitles you to 100 Percent Fed Up updates free of charge. You may opt out at anytime with a single click. Here's our Privacy Policy.\n\nThe NFL has been in the crosshairs of the left for a long time, and the NFL has been in the crosshairs of the right for a long time. The NFL has been in the crossh", "input_text": " Donktum\ud83d\udd39 (@CarpeDonktum) September 6, 2018\n\nThis is priceless! Its truly an amazing accomplishment!\n\nOUR PREVIOUS REPORT ON THE NIKE COMMERCIAL:\n\nTonight marks the start of the NFL season, with the first televised game. Tonight also marks the the debut of Nike\u2019s decision to make Colin Kaepernick, the sub-par, former San Fransisco 49 er\u2019s quarterback the face of their new 30th Annivesary \u201cJust Do It\u201d ad campaign.\n\nNike made the interesting decision to to make Colin Kaepernick the face of \u201csacrifice\u201d. So what exactly did Kaepernick sacrifice? He was already a sub-par quarterback, who after being released by San Fransisco, was likely not going to get picked up by another NFL team, and the Black Lives Matter domestic terror group, Kaepernick was defending with his actions, has pretty much gone underground after their fearless, anti-cop leader, Barack Obama left office.\n\nIf Nike truly wanted to honor those who \u201cbelieve in something\u201d and \u201csacrifice everything\u201d they need look no further than the grave markers at Arlington National Cemetery.\n\nAs the calls for boycotts of the NFL grow louder each year, many are wondering if the NFL can survive another season of anti-American, anti-cop, antics by overpaid players, many of them with criminal records, who were inspired by Colin Kaepernick, the first person to kneel, and later to sit on the bench, as a show of disdain for our nation, and for our law enforcement community, while our national anthem was being played.\n\nShould Trump appoint a new Supreme Court Justice before the election?\n\nYes No Just show the results\n\nEnter your email to see the"} +{"output_text": "\u03b8\u03b5\u03c3\u03b7!\n\n\u0393\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03b9\u03ba\u03ae \u03b1\u03b3\u03c9\u03bd\u03af\u03b1!\n\n\u0393\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03b9\u03ba\u03ae \u03b1\u03b3\u03c9\u03bd\u03af\u03b1!\n\n\u0393\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03b9\u03ba\u03ae \u03b1\u03b3\u03c9\u03bd\u03af\u03b1!\n\n\u0393\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03b9\u03ba\u03ae \u03b1\u03b3\u03c9\u03bd\u03af\u03b1!\n\n\u0393\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03b9\u03ba\u03ae \u03b1\u03b3\u03c9\u03bd\u03af\u03b1!\n\n\u0393\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03b9\u03ba\u03ae \u03b1\u03b3\u03c9\u03bd\u03af\u03b1!\n", "input_text": "\u03bd\u03b1\u03c2 \u03bc\u03b1\u03c2 \u03b5\u03af\u03bd\u03b1\u03b9 \u03ba\u03bf\u03b9\u03bd\u03cc\u03c2 \u03ba\u03b1\u03b9 \u03bc\u03bf\u03b9\u03c1\u03b1\u03b6\u03cc\u03bc\u03b1\u03c3\u03c4\u03b5 \u03c4\u03b7\u03bd \u03af\u03b4\u03b9\u03b1 \u03c7\u03b1\u03c1\u03ac \u03ba\u03b1\u03b9 \u03c4\u03bf\u03c5\u03c2 \u03af\u03b4\u03b9\u03bf\u03c5\u03c2 \u03c0\u03cc\u03bd\u03bf\u03c5\u03c2 \u03bc\u03b5 \u03cc\u03bb\u03bf\u03c5\u03c2 \u03b1\u03c5\u03c4\u03bf\u03cd\u03c2 \u03c4\u03bf\u03c5\u03c2 \u03b1\u03bd\u03b8\u03c1\u03ce\u03c0\u03bf\u03c5\u03c2 \u03c0\u03bf\u03c5 \u03b4\u03b9\u03b1\u03c7\u03ad\u03bf\u03c5\u03bd \u03c4\u03bf \u03b4\u03b7\u03bb\u03b7\u03c4\u03ae\u03c1\u03b9\u03bf \u03c4\u03b7\u03c2 \u03b5\u03bb\u03b5\u03c5\u03b8\u03b5\u03c1\u03af\u03b1\u03c2 \u03c3\u03c4\u03bf\u03bd \u03b5\u03be\u03bf\u03c5\u03c3\u03b9\u03b1\u03c3\u03c4\u03b9\u03ba\u03cc \u03ba\u03bf\u03b9\u03bd\u03c9\u03bd\u03b9\u03ba\u03cc \u03b9\u03c3\u03c4\u03cc.\n\n\u039a\u03ac\u03c0\u03bf\u03c5 \u03b5\u03b4\u03ce \u03c6\u03c4\u03ac\u03bd\u03c9 \u03c3\u03c4\u03bf \u03c4\u03ad\u03bb\u03bf\u03c2 \u03b1\u03c5\u03c4\u03ae\u03c2 \u03c4\u03b7\u03c2 \u03b1\u03c6\u03ae\u03b3\u03b7\u03c3\u03b7\u03c2.\n\n\u0391\u03c5\u03c4\u03cc\u03c2 \u03ae\u03c4\u03b1\u03bd \u03bf \u0391\u03bb\u03ad\u03be\u03b1\u03bd\u03b4\u03c1\u03bf\u03c2 \u03ba\u03b1\u03b9 \u03b1\u03c5\u03c4\u03cc\u03c2 \u03b5\u03af\u03bc\u03b1\u03b9 \u03b5\u03b3\u03ce. \u0394\u03b5\u03bd \u03bc\u03b5\u03c4\u03b1\u03bd\u03b9\u03ce\u03bd\u03c9 \u03b3\u03b9\u03b1 \u03ba\u03ac\u03c4\u03b9 \u03ba\u03b1\u03b9 \u03b5\u03be\u03b1\u03ba\u03bf\u03bb\u03bf\u03c5\u03b8\u03ce \u03bd\u03b1 \u03c0\u03b9\u03c3\u03c4\u03b5\u03cd\u03c9 \u03cc\u03c4\u03b9 \u03b7 \u03bc\u03cc\u03bd\u03b7 \u03b1\u03be\u03b9\u03bf\u03c0\u03c1\u03b5\u03c0\u03ae\u03c2 \u03b5\u03c0\u03b9\u03bb\u03bf\u03b3\u03ae \u03c3\u03c4\u03b9\u03c2 \u03bc\u03ad\u03c1\u03b5\u03c2 \u03bc\u03b1\u03c2 \u03b5\u03af\u03bd\u03b1\u03b9 \u03b1\u03c5\u03c4\u03ae \u03c4\u03bf\u03c5 \u03c0\u03bf\u03bb\u03cd\u03bc\u03bf\u03c1\u03c6\u03bf\u03c5 \u03b1\u03bd\u03b1\u03c4\u03c1\u03b5\u03c0\u03c4\u03b9\u03ba\u03bf\u03cd \u03b1\u03b3\u03ce\u03bd\u03b1 \u03b3\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03af\u03b1. \u0393\u03b9\u03b1 \u03cc\u03bb\u03bf\u03c5\u03c2 \u03c4\u03bf\u03c5\u03c2 \u03bb\u03cc\u03b3\u03bf\u03c5\u03c2 \u03c4\u03bf\u03c5 \u03ba\u03cc\u03c3\u03bc\u03bf\u03c5 \u03b7 \u03b1\u03bd\u03b1\u03bc\u03ad\u03c4\u03c1\u03b7\u03c3\u03b7 \u03bc\u03b5\u03c4\u03b1\u03be\u03cd \u03c4\u03bf\u03c5 \u03ba\u03cc\u03c3\u03bc\u03bf\u03c5 \u03c4\u03b7\u03c2 \u03b5\u03bb\u03b5\u03c5\u03b8\u03b5\u03c1\u03af\u03b1\u03c2 \u03ba\u03b1\u03b9 \u03c4\u03bf\u03c5 \u03ba\u03cc\u03c3\u03bc\u03bf\u03c5 \u03c4\u03b7\u03c2 \u03c5\u03c0\u03bf\u03b4\u03bf\u03cd\u03bb\u03c9\u03c3\u03b7\u03c2 \u03b8\u03b1 \u03c3\u03c5\u03bd\u03b5\u03c7\u03b9\u03c3\u03c4\u03b5\u03af \u03bc\u03ad\u03c7\u03c1\u03b9\u03c2 \u03b5\u03c3\u03c7\u03ac\u03c4\u03c9\u03bd.\n\n\u03a4\u03b9\u03bc\u03ae \u03b3\u03b9\u03b1 \u03c0\u03ac\u03bd\u03c4\u03b1 \u03c3\u03b5 \u03cc\u03bb\u03bf\u03c5\u03c2 \u03c4\u03bf\u03c5\u03c2 \u03bd\u03b5\u03ba\u03c1\u03bf\u03cd\u03c2 \u03c4\u03bf\u03c5 \u03b1\u03c0\u03b5\u03bb\u03b5\u03c5\u03b8\u03b5\u03c1\u03c9\u03c4\u03b9\u03ba\u03bf\u03cd \u03b1\u03b3\u03ce\u03bd\u03b1!\n\n\u0393\u03b9\u03b1 \u03ad\u03bd\u03b1\u03bd \u039c\u03b1\u03cd\u03c1\u03bf \u0394\u03b5\u03ba\u03ad\u03bc\u03b2\u03c1\u03b7!\n\n\u0393\u03b9\u03b1 \u03c4\u03b7\u03bd \u03b1\u03bd\u03b1\u03c1\u03c7\u03b9\u03ba\u03ae \u03b1\u03bd\u03c4\u03b5\u03c0\u03af"} +{"output_text": " said, the Seattle area has been a seller's market. But the last few years have been a buyer's market, with the number of homes for sale increasing by about 20 percent a year.\n\n\"It's a very different market than it was 10 years ago,\" Akeson said. \"It's a very different market than it was 20 years ago.\"\n\nThe Seattle area's housing market is still a seller's market, but it's a seller's market with a lot of", "input_text": " a block from Green Lake, and the aforementioned increased love for mid-century homes. And this isn't just any mid-century. It's a notable home, custom built, with a distinctive panoramic living room window in front. The kind of place people walk by and say, \"If that ever hits the market...\"\n\nThe sellers got an inspection of the home, including the sewer line, to head off any issues. They replaced an old electrical panel and were getting an underground oil tank properly decommissioned.\n\nFinally, although home inventory has been increasing in recent months, it's still historically low. Seattle had just 1.26 months' worth of homes for sale in May, based on that month's sales pace, according to the Northwest Multiple Listing Service. That's up from 1.18 months of inventory a year earlier, but well below the four to six months considered the balance point between buyers and sellers. The area that takes in Green Lake (and everywhere else in Seattle north of the Ship Canal and west of Interstate 5) was even tighter, with 0.81 months of inventory.\n\nSeattle-based online real estate company Redfin said in its most-recent \"Bidding Wars\" report that 62.1 percent of its offers in the Seattle area faced competition in March, down from 71.9 percent a year earlier, and 41.9 percent of homes sold for more than asking price. Looking at Redfin's winning offers, 37.2 percent had a cover letter (in which hopeful buyers often write about how much they love a home and how well they'll take care of it), 23.3 percent waived the right to back out of a sale if an inspection uncovered an issue the owners wouldn't fix, 14 percent were all-cash and 11.6 percent waived the right to bail if financing fell through.\n\nFor decades, Akeson"} +{"output_text": " of congestion pricing. It also called for the creation of a national park in the Meadowlands, a proposal that has been met with a mixture of derision and enthusiasm.\n\nThe RPA\u2019s plan is a little different from the others in that it\u2019s not a blueprint for the future, but a proposal for the present. It\u2019s a proposal for the future that\u2019s been made in the present, and it\u2019s a proposal that\u2019s been made by a", "input_text": "\n\nThe slice of the New Jersey Meadowlands seen from the train window is an unmatched panorama of glorious incongruity. Robert Freudenberg, the vice president of energy and environmental programs for an urban research organization called the Regional Plan Association (RPA), used to commute from his home in New York City to a job in Trenton, New Jersey, where he worked for the Department of Environmental Protection. He\u2019d take the train daily, and \u201cthere would always be this moment\u2026 I remember sitting on the train in those days, and every so often seeing an egret.\u201d\n\nI\u2019m not surprised, therefore, that this common experience is part of what inspired a wonderfully contrarian proposal, one that would turn the Meadowlands, with its unexpected serenity and its surfeit of industrial detritus, into a national park.\n\nThe RPA, where Freudenberg works, is a relatively obscure organization founded in 1922 to try to shape the growth of the New York City metropolitan area, a wonky sidekick to the Port Authority that emerged around the same time to manage the transportation infrastructure shared by New York City and New Jersey. As its name suggests, the RPA issues plans for the future, ambitious compendia released at odd intervals that tend to say more about contemporary attitudes toward cities than they do about what the coming decades will hold, but that have also been quite influential. The first Regional Plan, released in 1929, pitched the rats\u2019 nest of area highways that Robert Moses subsequently took it upon himself to build; the 1996 plan, the organization\u2019s third, called for building a connection between the Long Island Railroad and Grand Central Terminal, a project now well underway.\n\nThe Fourth Regional Plan was released at the end of last year and strongly advocated repairing and expanding the subway system, building more affordable housing, and implementing a California-style system"} +{"output_text": " a ditch, the government is in meltdown, the country is in a state of emergency, the government is in crisis, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a mess, the government is in a", "input_text": " by what seemed to be a fibre-optic version of a Victorian counting house - a squad of young people sitting at rows of desks, on the phone bending the ears of journalists. At the top \u2013 can he really have been sitting at a higher desk? - that\u2019s certainly how I think I remember it \u2013 sat the brooding figure of Alastair Campbell. The scene showed how thoroughly priorities had changed: where once government used the room to control and discipline its MP\u2019s in parliament it now used it to try something similar with the media. If you read Alastair Campbell\u2019s diaries \u2013 which will turn out to be such a gold mine for future\u2026. psychiatrists \u2013 you get the first hand version. This was an administration so obsessed with its own PR that the man hired to handle it is even drafting the resignation letters of people who quit the government as a matter of principle. My own theory about why the diaries are only 789 pages long is that he ran out of expletives to use to describe the media. But the fact he came loathe the trade he had once practised shouldn\u2019t blind us to the fact the may have a point or two.\n\nIn his speech, which managed to avoid the words wanker, prat, shit and the like \u2013 obviously not drafted by Alastair - Blair admitted that a vast amount of the work of his government \u2013 perhaps too much - had been devoted to handling the media. He justified it by claiming this was because we in the media pay little attention to what goes in places like parliament because we\u2019re obsessed by impact. In a choice between impact and accuracy, he said, impact wins. Scandal or controversy beats ordinary reporting hands down. He went on to accuse us of using extravagant language: every problem\u2019s a crisis, policies don\u2019t run into difficulty, they end up in"} +{"output_text": " flavor. The tea is still very smooth and easy to drink.\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\n2013", "input_text": " post-fermented tea are harvesting, withering, low temperature pan firing (or steaming), rolling, drying, fermenting, and a final drying.\n\nThis means that there is a lot of room for variation! The unique terroir of the harvesting region, the microbial environment that the tea is produced in, and the fermentation method used can all greatly impact the final qualities in the brewed tea. In order to expand my concept of what the category of fermented tea can include, I am drinking a kind of tea called Fu brick tea for this blog post. This kind of tea is produced in China\u2019s Hunan province.\n\n2013 Bai Sha Xi Factory Fu Brick Tea\n\nBai Sha Xi is a factory in Hunan Province\u2019s Anhua County. It was founded in 1940, and has been producing Fu Brick Tea (\u832f\u7816\u8336 Fu Zhuan Cha in Chinese) for many years. The fermentation method used to make the 2013 Fu Zhuan Cha I enjoyed today is proprietary. It is common for tea factories to protect their recipes. The vendor did say that the bricks are pressed with golden flower spores, which are allowed to thrive for 3 days. After the 3 days have passed, the bricks are baked at a low temperature to dry the tea before packaging.\n\nThe smell of the dry tea leaves is instantly recognizable as the smell of pipe tobacco. It has a pungent and fruity sweet quality that make it seem as if the tea were aromatized.\n\nThe flavor of the tea is much mellower. It brews up a golden orange color, and has a woody flavor with a quality that reminds me of the dry leaves\u2019 pipe tobacco aroma. The texture is smooth, simple, and comfortable.\n\nAfter a few infusions the tea opened up leaving a somewhat more complex woody"} +{"output_text": " But it\u2019s not a dead zone. It\u2019s a fault that\u2019s been slowly building up strain for thousands of years.\n\nThe New Madrid Seismic Zone is a bit of a mystery. It\u2019s not a dead zone, but it\u2019s not a live zone, either. It\u2019s a zone that\u2019s been building up strain for hundreds of thousands of years, and it\u2019s not going to stop.\n\nThe New Madrid Seismic Zone is a bit", "input_text": " that was robbed. \u201cIf you say, \u2018OK, the really dangerous place is where the last earthquake was,\u2019 you get into this whack-a-mole situation,\u201d he said. If the New Madrid Seismic Zone really is shutting down, then it\u2019s someone else\u2019s turn. North America isn\u2019t going to stop deforming and all that strain has to go somewhere. Somewhere in the country, perhaps some fault has been imperceptibly building up strain for hundreds of thousands of years and will soon release it. Perhaps it\u2019s the blank areas of our hazard maps that should worry us. In Nepal, the country\u2019s earthquake hazard map showed the two ends of the country glowing with seismic hazard and a placid interior. Last April an enormous quake struck this unremarkable center of the country and killed more than 8,000 people. And New Madrid isn\u2019t the only bizarre quake site far from a plate boundary in the United States. In 1886, a magnitude 7 earthquake leveled Charleston, South Carolina. In 1727, Massachusetts Bay Colony preacher, witch-hater and would-be Infowars.com freelancer, Cotton Mather was shaken by an earthquake that hit New Hampshire and rattled Boston. Mather blamed Providence for the shaking. \u201cShall we say, All this is but a Chance that happens to us or the mere unguided Motion of Matter?\u201d he wrote. \u201cAh, profane Philistine!\u201d\u201d\n\nThree decades later, another earthquake struck off the coast of Gloucester and rocked the city once more. A similar earthquake today might devastate Boston\u2019s Back Bay neighborhood, which is built on landfill that would liquefy during heavy shaking. In New York City, the nearby Ramapo Fault hasn\u2019t had a major earthquake on it, as far as anyone can tell, for hundreds of years."} +{"output_text": "\u00e2ce \u00e0 leur CV-driven. Ils ont \u00e9t\u00e9 pay\u00e9s \u00e0 la mode, et ont \u00e9t\u00e9 embauch\u00e9s par des entreprises qui n\u2019avaient pas besoin de leur expertise.\n\nJe ne sais pas si c\u2019est une tendance de fond, mais je connais plusieurs anciens \u00e9l\u00e8ves s\u2019\u00e9tant lanc\u00e9s imm\u00e9diatement en tant que freelance, gr\u00e2ce \u00e0 leur CV-driven. Ils ont \u00e9t\u00e9", "input_text": " ces hypoth\u00e8ses.\n\nJ\u2019ai pris des frameworks RAD comme exemple, mais bien entendu, ces mots clefs sont rempla\u00e7ables par n\u2019importe quel autre, \u00e0 n\u2019importe quelle \u00e9chelle : du moment qu\u2019une technologie a \u00e9t\u00e9 choisie ind\u00e9pendamment du probl\u00e8me, avec pour but d\u2019\u00e9conomiser un maximum d\u2019argent lors de la cr\u00e9ation du produit, ou pour tenter de pouvoir recruter du junior pressurisable, la situation est identique.\n\nCV-driven\n\nUne mani\u00e8re d\u2019augmenter ses tarifs dans un tel \u00e9cosyst\u00e8me, est de faire du CV-driven development. Le CV-driven development consiste \u00e0 imposer des technologies \u00e0 la mode sur un produit, ind\u00e9pendamment de leur pertinence, afin ensuite de trouver un emploi bien pay\u00e9 en tant qu\u2019expert de cette technologie.\n\nSi vous vous vendez sur vos connaissances techniques, vous allez aller dans des bo\u00eetes en accord avec \u00e7a. \u00c9tant donn\u00e9 que le CV-driven donne rarement de bons d\u00e9veloppeurs, vous allez donc \u00eatre pisseur. Il est possible d\u2019\u00eatre mieux pay\u00e9 dans cette cat\u00e9gorie, en surfant sur la vague de la hype, mais votre valeur ne sera jamais per\u00e7ue comme sup\u00e9rieure \u00e0 celle d\u2019un autre CV-driven plus jeune, et n\u00e9cessairement moins cher.\n\nJe ne sais pas si c\u2019est une tendance de fond, mais je connais plusieurs anciens \u00e9l\u00e8ves s\u2019\u00e9tant lanc\u00e9s imm\u00e9diatement en tant que freelance, gr"} +{"output_text": " with the Tumblr community, and they were impressed by the quality of the content. \"We were seeing a lot of really interesting stuff,\" says Karp. \"We were seeing a lot of really interesting stuff.\"\n\nThe Tumblr team was also impressed by the way the site was being used. \"We were seeing a lot of really interesting stuff,\" says Karp. \"We were seeing a lot of really interesting stuff.\"\n\nThe Tumblr team was also impressed by the way the", "input_text": " Tumblr to express themselves. Danah Boyd, a senior researcher at Microsoft Research who studies social networks, says: \"I saw people who were frustrated with Facebook at not being able to express themselves. Tumblr became a powerful complement to Facebook. It didn't require the heavy-handed 'we must be friends' that is so ingrained in Facebook.\"\n\nBy March, Tumblr users were making 10,000 posts each hour. Karp and Arment continued consulting. The site cost about $5,000 a month to run, so they began speaking to a few angel investors and venture capitalists. In October 2007, they sold 25 per cent of the company to Spark Capital and Union Square Ventures, and betaworks head John Borthwick and Vimeo founder Jakob Lodwick, for $750,000, valuing the fledgling company at $3 million.\n\nKarp realised that \"we had a network of cool blogs\" -- online art galleries such as Eat Sleep Draw and meta webcomics such as Garfield Minus Garfield. But it was difficult to discover new tumblelogs. Users started self-organising: they would write posts, asking for others' Tumblr addresses, then post a complete list of these accounts to their own tumblr. \"They were hacking the network,\" says Karp. \"Even though they had no tools at all, they were finding each other and drawing these lines between each other.\" Users were connecting with one another, drawing lines on a social graph based around their interests: out of the images and pithy quotes, a network was emerging.\n\nKarp added directories, which grouped blogs by the tags they assigned themselves, and he allowed those not signed up to browse Tumblr sites. He also created the Radar -- interesting blogs selected by Karp and Arment. The two were keeping up"} +{"output_text": ".8 percent.\n\nThe NFL\u2019s expansion to 32 teams in 2002 was supposed to be a game-changer. The league was growing, and the NFL was growing with it.\n\nBut the NFL\u2019s expansion to 32 teams in 2002 was a game-changer for the wrong reasons.\n\nThe NFL\u2019s expansion to 32 teams in 2002 was a game-changer for the wrong reasons.\n\nThe NFL\u2019s expansion to 32 teams in 2002 was a", "input_text": " femme and non-binary individuals in this conversation is incredibly important, as erasure of their existence as such (and assumptions that they are heterosexual and/or cisgender women) only serves to exacerbate the justice issues raised here. If these terms are new for you, see here and here.\n\n[1] While most women and femmes will use the pronouns she/her/hers, this is not necessarily the case for everyone, especially non-binary individuals. I have intentionally chosen to refer to the associate pastor using the pronouns she/her/hers in this article, as a way to focus on gender dynamic at play, but if your associate colleague uses different pronouns, then you should clearly use them, and educate and hold your congregation accountable for doing the same, even if it means you need to practice the right pronouns in a mirror for five minutes every day.\n\n[2] While this article focuses squarely on the behavior of older malesenior pastors, a number of associate pastors also report similarly destructive and dismissive behavior on the part of their senior female colleagues. Pointing this out runs the risk of feeding the #NotAllMen apologists, but it\u2019s worth noting that almost all the advice offered here applies to senior ministers of any gender in relationship with their associate colleagues.\n\nThe Rev. Andrea Roske-Metcalfe serves as the associate pastor of Grace Lutheran Church in Apple Valley, Minnesota. She earned her MDiv from Union Theological Seminary in New York City, and she lives in Minneapolis with her husband, Luke, and their two daughters.\n\nImage by: Don Blumenstein Used with permission CLEVELAND, Ohio -- Since the NFL expanded to 32 teams and split into eight divisions, 109 teams between 2002-18 started the season 2-4.\n\nTen made the playoffs. That\u2019s 9"} +{"output_text": "[_] Anon 1737987 >># I'm not sure if I'm supposed to be posting this, but I'm going to anyway.\n\n\n\n[G5P8EAI] http://boards.4chan.org/f/res/1737987 ARCHIVED Discovered: 7/8 -2012 08:22:00 Ended: 8/8 -2012 05:50:22 Flashes: 1 Posts: 8\n\nFile: SKEET_", "input_text": "...\n\n>> [_] Anon 1760436 \n\n>> [_] Anon 1760437 This guy's flashes are always welcome and highly encouraged.\n\n>> [_] Anon 1760450 would have loved to see her own shit getting fucked up into her\n\n>> [_] Anon 1760489 This thing would be much better with a pony instead the girl\n\n>> [_] Anon 1760493 >># Fuckken LOL'd.\n\n\n\n[UDQWB87] http://boards.4chan.org/f/res/1755943 ARCHIVED Discovered: 1/9 -2012 09:06:41 Ended: 2/9 -2012 02:30:47 Flashes: 1 Posts: 5\n\nFile: SKEET_FIGHTER.swf-(1.05 MB, Hentai)\n\n[_] Anon 1755943 Marked for deletion (old).\n\n>> [_] Anon 1755944 OH YEAH GURL OH YEAH GURL OH MYGOD IT FEELS SO GOOD AROUND MY DICK GURL\n\n>> [_] Anon 1755950 AWWWW YEAH, GIMMIE THE CHOCOLATE.\n\n>> [_] Anon 1755954 dat eye flutter\n\n>> [_] Anon 1756045 why do I want to fap to this\n\n\n\n[G5P8EAI] http://boards.4chan.org/f/res/1737987 ARCHIVED Discovered: 7/8 -2012 08:22:00 Ended: 8/8 -2012 05:50:22 Flashes: 1 Posts: 8\n\nFile: SKEET_FIGHTER.swf-(1.05 MB, Other)\n\n"} +{"output_text": "resse collettivo.\n\nUnit\u00e0 immobiliari speciali per funzioni pubbliche o di interesse collettivo\n\nGruppo W\n\n(Unit\u00e0 immobiliari speciali per funzioni pubbliche o di interesse collettivo)\n\nW/1 \u2013 Stazioni per servizi di trasporto terrestri, marittimi, aerei ed impianti di risalita.\n\nW/2 \u2013", "input_text": "\n\nGruppo T\n\n(Unit\u00e0 immobiliari a destinazione terziaria)\n\nT/1 \u2013 Negozi e locali assimilabili.\n\nT/2 \u2013 Magazzini, locali da deposito e laboratori artigianali.\n\nT/3 \u2013 Fabbricati e locali per esercizi sportivi.\n\nT/4 \u2013 Pensioni.\n\nT/5 \u2013 Autosilos, autorimesse e parcheggi a raso di tipo pubblico.\n\nT/6 \u2013 Stalle, scuderie e simili.\n\nT/7 \u2013 Uffici, studi e laboratori professionali.\n\nUnit\u00e0 immobiliari speciali\n\nGruppo V\n\n(Unit\u00e0 immobiliari speciali per funzioni pubbliche o di interesse collettivo)\n\nV/1 \u2013 Stazioni per servizi di trasporto terrestri, marittimi, aerei ed impianti di risalita.\n\nV/2 \u2013 Stabilimenti balneari e di acque curative.\n\nV/3 \u2013 Fiere permanenti, recinti chiusi per mercati, posteggio bestiame e simili.\n\nV/4 \u2013 Fabbricati destinati all\u2019esercizio pubblico dei culti, cappelle ed oratori.\n\nV/5 \u2013 Ospedali.\n\nV/6 \u2013 Fabbricati, locali, aree attrezzate per esercizi sportivi e per divertimento, arene e parchi zoo.\n\nV/7 \u2013 Unit\u00e0 immobiliari a destinazione pubblica o di inte"} +{"output_text": " captain to be: a no-nonsense, no-bullshit, no-nonsense-at-all kind of guy.\n\n\u201cHe\u2019s a very good captain,\u201d says Tusha. \u201cHe\u2019s very professional, very organized, very organized, very smart, very smart, very good with people, very good with the crew, very good with the guests, very good with the guests, very good with the guests, very good with the guests.\u201d\n\nRos", "input_text": " who behave outrageously at nightclubs or music festivals, but because the crew actually had to operate a boat, Below Deck\u2019s producers couldn\u2019t just hire any unhinged extrovert eager for fame. \u201cYou can\u2019t fake being a yachtie, because it\u2019s a tough job,\u201d says Cronin. The cast would also need proper safety certifications and training. So with Gang\u2019s help, Cox and his team found people by going to crew houses, where workers stay between jobs, and spreading the word in Fort Lauderdale while shooting the sizzle reel.\n\nProducers had initially wanted the show\u2019s captain to be young and handsome. They thought they\u2019d found their man in Aleks Taldykin, a yachting pro who auditioned because the economy was still in a postrecession slump and he needed a job. But according to Simon Tusha, who owned the charter company that rented the boat to 51 Minds, the yacht\u2019s owner was uncomfortable with Taldykin taking the helm, even though he was technically qualified. Right before filming began in St. Martin, producers and the owner asked Rosbach. He agreed because his boss asked him to.\n\n\u201cLee was really pissed, which was really funny because now he loves it,\u201d says Taldykin, who thought about quitting the show before producers persuaded him to stay on as Rosbach\u2019s onscreen No. 2. (Taldykin left after one season and now runs his own yacht-charter company.) The switch ended up being a brilliant decision. After Below Deck\u2019s 2013 debut, Captain Lee immediately became its biggest star when he kicked the show\u2019s first charter guests off the boat after a crew member discovered a white powder and rolled-up bill in one of their cabins. Rosbach was exactly what audiences wanted a"} +{"output_text": " ageing.\n\nThe authors conclude that \"the clonal expansion of mtDNA mutations is the major driving force behind the mitochondrial dysfunction associated with ageing.\"\n\nThe authors of the study, published in the journal Nature Communications, are from the University of Cambridge, the University of Oxford, and the University of Edinburgh.\n\nThe study was funded by the Wellcome Trust, the Medical Research Council, the Biotechnology and Biological Sciences Research Council, the Engineering and Physical Sciences Research Council, the National Institute for Health Research", "input_text": " quiet, they are many. That's the context to those comments from coach and colleagues: They're aware that Oblak could go and, more importantly, aware of what that would mean. \"I don't know any champion without a great goalkeeper,\" Simeone said.\n\nAtletico have a great goalkeeper, maybe even the greatest of them all. And they're desperate to do what he does better than anyone else: hold on tight.\n\nAs club captain Koke put it: \"If I was in charge here, I'd give him a blank cheque.\" Age-related decline in the integrity of mitochondria is an important contributor to the human ageing process. In a number of ageing stem cell populations, this decline in mitochondrial function is due to clonal expansion of individual mitochondrial DNA (mtDNA) point mutations within single cells. However the dynamics of this process and when these mtDNA mutations occur initially are poorly understood. Using human colorectal epithelium as an exemplar tissue with a well-defined stem cell population, we analysed samples from 207 healthy participants aged 17\u201378 years using a combination of techniques (Random Mutation Capture, Next Generation Sequencing and mitochondrial enzyme histochemistry), and show that: 1) non-pathogenic mtDNA mutations are present from early embryogenesis or may be transmitted through the germline, whereas pathogenic mtDNA mutations are detected in the somatic cells, providing evidence for purifying selection in humans, 2) pathogenic mtDNA mutations are present from early adulthood (<20 years of age), at both low levels and as clonal expansions, 3) low level mtDNA mutation frequency does not change significantly with age, suggesting that mtDNA mutation rate does not increase significantly with age, and 4) clonally expanded mtDNA mutations increase dramatically with age. These data confirm that clonal expansion of mtDNA mutations, some of which are generated very early in life, is the major driving force behind the mitochondrial dysfunction associated with"} +{"output_text": " grid is a national asset, less likely to be seized by insurgents. Saudi Arabia and Kuwait could also help Iraq by providing it with the expertise it needs to develop its own electricity sector.\n\nThe GCC\u2019s role in Iraq is not without risks. The GCC\u2019s support for Iraq is not unconditional. The GCC\u2019s support for Iraq is not unconditional. The GCC\u2019s support for Iraq is not unconditional. The GCC\u2019s support for Iraq is not unconditional. The GCC\u2019s", "input_text": " the country, continues to target Iraq\u2019s national grid. Given Iranian opposition to GCC assistance, there is also a possibility that Iran-allied militias could sabotage the power lines \u2014 although there is no clear evidence that they have attacked power infrastructure thus far. Iraq also has a serious problem with electricity theft; in 2015, the Ministry of Electricity collected payments for just 12 percent of the electricity it produced.\n\nSaudi Arabia and Kuwait could alleviate the threat of sabotage by asking Iraq to hire private security to guard new power infrastructure. As a failsafe, both countries will be able to cut the power if the infrastructure is seized by ISIL or Iran-allied militias. Electricity theft, which targets local low-voltage networks, is a thornier problem. Saudi Arabia and Kuwait could help mitigate, although certainly not eliminate, this problem by providing Iraq with electricity meters and perhaps reconfiguring the ministry\u2019s utility payments system along the lines of the Saudi Electricity Company\u2019s highly successful model.\n\nAn Iraq\u2013GCC interlink will give Saudi Arabia the Iraqi foothold it has long sought. In so doing, it can serve as one possible geoeconomic alternative to traditional Saudi checkbook diplomacy \u2014 the exchange of economic aid and investment for political concessions \u2014 which has yielded diminishing returns for the kingdom. For example, while billions of dollars in oil and cash may have helped Saudi Arabia reacquire the islands of Tiran and Sanafir from Egypt, it did nothing to prevent Cairo from opposing Riyadh at the United Nations or sending just 800 troops to back the kingdom\u2019s war in Yemen. Similarly, keeping Lebanon\u2019s economy solvent has not diminished Hizballah nor has bankrolling friendly politicians in Iraq done much to advance GCC interests.\n\nUtility exports like electricity are more cost-effective, less prone to graft, and, because the electric"} +{"output_text": " bowl is lifted to reveal a pile of shredded pork.\n\nThe menu is a mishmash of Italian and American dishes, with a few nods to the local cuisine. A plate of fried calamari is topped with a pile of fried chicken wings, and a plate of fried calamari is topped with a pile of fried chicken wings. A plate of fried calamari is topped with a pile of fried chicken wings. A plate of fried calamari is topped with a pile of", "input_text": "\n\nAn adult dining companion returns with a plate of meat and fish so ghastly that it appears to have been attacked by raccoons, and announces, \"Surf and turf!,\" and the table dissolves into laughter.\n\nMeats, vegetables, and noodles on the grill. The Boston Globe/Globe Freelance\n\nA girl who couldn't have been older than 10 sets her bowl on the counter for one of the cooks. He stoops slightly to talk to her, then invites her back behind the counter to hold a spatula and give her food a few whacks. When he goes on vacation, he tells the beaming child, she'll have to come fill in.\n\n\nOn some level \u2014 a level that actually matters \u2014 a cook who is that sweet to a child is not a bad cook. So how can a restaurant that is filled to capacity with people unapologetically enjoying themselves be a bad restaurant?\n\nFor all the joy on display here, FiRE + iCE seems hellbent on finding out.\n\nA small selection of appetizers, mostly deep-fried and prepared by the kitchen rather than on the flat-top, is something less than an afterthought. Seemingly straightforward Buffalo calamari \u2014 deep-fried calamari with Buffalo wing sauce and blue cheese dressing on the side \u2014 is aggressively terrible, the squid so tough that chewing it into submission is a challenge. Sweet potato tots have either been fried in oil no hotter than bathwater or removed far too quickly, turning them into a sad, sodden clod of orange gnocchi.\n\n[Love letters: After divorce, why can't I find 'the one?']\n\nA carefully assembled vegetable Alfredo bowl with peas ends up with a stranger's pork chops hitching a ride \u2014 a mismatch only uncovered when the"} +{"output_text": " I was still in college. I have been tutoring students for the past two years and have helped over 100 students prepare for the SAT and ACT. I have also helped students prepare for the PSAT and the SSAT. I have been tutoring students for the past two years and have helped over 100 students prepare for the SAT and ACT. I have also helped students prepare for the PSAT and the SSAT. I have been tutoring students for the past two years and have helped over 100", "input_text": " smart cameras that sent video feeds to her phone.\n\nWilson eventually apologized for his behavior and she asked him to join her in Knoxville where she was spending the holiday with her family and her children. He arrived there on Dec. 21. The next evening, however, he suddenly got up while they were watching television around midnight and left. In the morning she began to receive text messages from Wilson, who insisted that she was cheating on him.\n\nShe said Wilson didn't return until about 4:30 p.m. Dec. 24, some 40 hours later. In that time, she received an alert on her phone that the smart cameras in the Palm Coast home were disconnected. Wilson also made comments that made her believe that he had visited the home during his absence, including inquiring about the location of the firearm and cameras. He also advised her \"to use the front door of the home because the garage door isn't going to open.\"\n\nShe told her stepfather about the comments and he checked the home before she returned to Florida.\n\nWilson was arrested on a warrant at 12:30 p.m. Thursday in Knoxville. He was being held Friday on $150,000 bail and will be extradited to the Flagler County Detention Facility.\n\nA Facebook page belonging to Wilson shows him gripping two handguns, and he boasts that he \"is the guy that your father warned you about.\" It also lists him as \"widowed.\" Hello! Who are you and what business did you start?\n\nHey everyone! My name is Adam Shlomi and I am a 22-year-old senior at Georgetown University from South Florida. I am the founder of SoFlo SAT Tutoring, an online SAT/ACT tutoring company that provides exceptional test prep to students across the country. I started this business in my bedroom almost a year ago while"} +{"output_text": ". Il titolo \u00e8 \u201cLa scomparsa di Daphne Caruana Galizia: un colpo alla democrazia\u201d. Il testo \u00e8 molto interessante e mi piacerebbe leggerlo. Ma non posso perch\u00e9 non conosco il dottor Tornago.\n\nAvv. Massimo Malvestio\n\nEgregio Direttore, leggo oggi un articolo firmato", "input_text": " dichiarazione al Gazzettino: \u201cQuale corruzione? Malta e\u2019 un isola felice. Qui si vive benissimo. Il delitto \u00e9 maturato nell\u2019ambito del traffico di droga e petrolio. Lo stato non c\u2019entra nulla\u201d. In realt\u00e0 quel che si e\u2019 virgolettato era il riassunto non esattamente preciso giacch\u00e9 la specifica parte dell\u2019 intervista e\u2019 questa: Che idea si \u00e8 fatto dell\u2019omicidio Caruana? \u00abNon ho un\u2019idea definitiva, ovviamente, ma per le modalit\u00e0 con cui si \u00e8 consumato questo efferato delitto, la pista pi\u00f9 plausibile porta a pensare che la giornalista si fosse messa in rotta con i trafficanti di droga e petrolio. Probabilmente le sue indagini sono andate a toccare questo tipo di traffici ed interessi\u00bb. Su Daphne:\u201d La scomparsa di Daphne Caruana Galiza \u00e8 un colpo molto duro alla democrazia. La sua era una voce di libert\u00e0 autorevole.\u201d Certo poi ho difeso Malta ed il popolo maltese: ho dieci dipendenti maltesi, persone ottime che non possono essere infangate per la condotta di criminali.\n\nAvv. Massimo Malvestio\n\nEgregio Direttore, leggo oggi un articolo firmato dal mio compagno di corso all\u2019universit\u00e0 di Padova, dottor Paolo Biondani e dal dottor Andrea Tornago che invece non conosco"} +{"output_text": " out for me.\n\nI really liked the local approach of last year\u2019s WCS, I think it was a good idea to have the players in the same place. I think it\u2019s a good idea to have the players in the same place for the next season as well.\n\nI think it\u2019s a good idea to have the players in the same place for the next season as well.\n\nI think it\u2019s a good idea to have the players in the", "input_text": "So I have to start with the obvious question and the one I wanted to ask since day one of WCS: Why choose North America before Europe?\n\nNA is the region that allows you to travel and play online with too many latency issues, EU was not an option to me because I would be living in Korea most of the time and KR-EU lag is too much.\n\nWas your decision dictated by the overall level of competition and do you regret the choice seeing how it\u2019s basically \u201cGSL: The spin-off\u201d already?\n\nI think WCS NA was the best choice for me, preferrably I would\u2019ve liked to play in the old GSL but I think from a progamer perspective it wouldn\u2019t make sense for me to choose WCS KR right now.\n\nThere\u2019s also this moment where WCS Europe\u2019s production gets a way better outreach compared to America\u2019s, the finals particularly broke the 100K mark if I recall correctly. Is sacrificing such exposure worth it?\n\nIt\u2019s too early to say how the finals of WCS America will be, but to me WCS EU wasn\u2019t an option because of the travelling. I don\u2019t think anyone would pick a region just because it has more viewers, it\u2019s one of the less important things to consider.\n\n\n\n\u200b Photo: Carlton Beener\n\n\u200b\n\n\"I really liked the local approach of last year's WCS\"\n\nWhile we\u2019re on WCS topic, what\u2019s your take on the whole non-residency thing? Not many fans (or players for that matter) seem to like it.\n\nI preferred the old format but on the other hand it would\u2019ve caused problems for me now that I wanted to live in Korea. So personally I\u2019m happy since it\u2019s working"} +{"output_text": " was just a little kid, and she was already doing all these things.\"\n\nChou was a shy kid, or at least shyer than his sister, who was seven years his senior. \"She was doing magic shows, puppet shows, and public speaking way before I was,\" he remembers, \"I was just a little kid, and she was already doing all these things.\"\n\nChou was a shy kid, or at least shyer than his sister, who was seven years", "input_text": " this way I was so afraid of looking like an idiot,\" he recalls. \"But now I\u2019m in so deep that I can\u2019t even hear those voices.\"\n\nBlizzard uses Frodan and Brian \"Brian Kibler\" Kibler to cast the biggest Hearthstone matches.\n\nThere are Hearthstone celebrities like Kripparrian and Trump who\u2019ve cultivated huge followings through endless arena runs and a relentless streaming schedule. There are Hearthstone pros like Pavel and Purple who\u2019ve earned the respect of their peers through almost unbelievable competitive consistency and putting in marathon practice sessions with other pros. Frodan is neither. He streams irregularly and has no daily schedule, and though he's capable of finishing in the top 100 on ladder he has no tournament wins to his name.\n\nNone of which has stopped him from emerging as perhaps the most beloved, recognizable figure in Hearthstone's still young esports scene. As a broadcaster Chou is cheeky and suave, with the slight dash of acerbic wit necessary to give Hearthstone\u2019s endless bad beats and righteous draws their character. It\u2019s is a good place to be. And yet, he\u2019s not quite satisfied. \"If I just casted, and did only that, I think I\u2019d get bored and kinda sad after a while. I wanna do more things,\" says Frodan. \"So many people who do commentary\u2014over the years, they move onto other projects, largely because they realize that there\u2019s a lot more to explore with your influence than just yelling GG.\"\n\nSister act\n\nChou was a shy kid, or at least shyer than his sister, who was seven years his senior. \"She was doing magic shows, puppet shows, and public speaking way before I was,\" he remembers, \"I"} +{"output_text": "975 to $2,000 Subscription revenue $1,945 to $1,965 Billings $2,100 to $2,125 Non-GAAP gross margin 78% to 80% Non-GAAP sales and marketing 46% to 48% Non-GAAP research and development 14% to 16% Non-GAAP general and administrative 9% to 11% Non-GAAP interest and other income (expense) $(1) to $1 Provision for", "input_text": " videoconferencing. The company's platform includes several other technologies specific to remote agreements, such as video identity verification, collaborative form-filling, an integration with eSignature, and a detailed audit trail. plans to leverage Liveoak's technology and expertise to accelerate the launch of Notary, a new product for remote online notarization, where signers and the notary public are in different places. The beta release of Notary is currently slated for. CTO appointment. On August 25, 2020, DocuSign announced Kamal Hathi as its new chief technology officer (CTO). Prior to joining DocuSign, Kamal was chief product and technology officer at Trader Interactive, a leading provider of online marketplaces and products serving the lifestyle vehicles and commercial equipment sector. Before that he spent more than two decades at Microsoft, most recently as GM for its SaaS analytics and business intelligence solution, Power BI. As CTO, Kamal will oversee the development and execution of DocuSign's technology roadmap, including the expansion of the DocuSign Agreement Cloud.\n\nOutlook\n\nThe company currently expects the following guidance:\n\n? Quarter ending October 31, 2020 (in millions, except percentages):\n\nTotal revenue $358 to $362 Subscription revenue $343 to $347 Billings $380 to $390 Non-GAAP gross margin 78% to 80% Non-GAAP sales and marketing 46% to 48% Non-GAAP research and development 14% to 16% Non-GAAP general and administrative 9% to 11% Non-GAAP interest and other income (expense) $(1) to $1 Provision for income taxes $2 to $3 Non-GAAP diluted weighted-average shares outstanding 200 to 205\n\n? Year ending January 31, 2021 (in millions, except percentages):\n\nTotal revenue $1,"} +{"output_text": ", and you\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re done. You\u2019re", "input_text": "\n\nJSat\u2019s Superbird-8/DSN-1 satellite, carrying a Ku-/Ka-band commercial telecommunications payload and an X-band payload for Japan\u2019s Defense Ministry, was damaged during transit to Europe\u2019e spaceport and will require more than a year of repairs and retesting. Superbird-8/DSN-1 had been the intended Superbird-B2 replacement.\n\nFor unclear reasons, Sky Perfect JSat did not want the launch mass of JCSat-14 or JCSat-16 to be made public. But industry officials said JCSat-16, carrying an all-chemical propulsion system, weighed around 4,600 kilograms.\n\nA very crowded SpaceX manifest to the end of the year\n\nThe launch was the eighth of the year for Hawthorne, California-based SpaceX, which is confronting a list of at least nine missions whose owners in recent weeks have confirmed that they expect their launches to occur by December.\n\nSpaceX officials said early this year they hoped to conduct 18 launches in 2016 from both Cape Canaveral and Vandenberg Air Force Base, California, which is used for missions to high-inclination low-Earth orbit.\n\nThe following list of satellite owners who have informed their investors of scheduled 2016 launches is subject to change and does not include the inaugural SpaceX Falcon Heavy vehicle.\n\nSpaceX President Gwynne Shotwell said Aug. 9 in a presentation to the Small Sat 2016 conference that Falcon Heavy, originally scheduled for launch in 2013, was proving to be \u201cactually a harder problem than we thought.\u201d\n\nFalcon Heavy more challenging than expected\n\nAfter apologizing to customers for the delays, Shotwell said: \u201cI\u2019m president: There\u2019s three [Falcon 9 first stage] rockets. You glue them together"} +{"output_text": " the middle of the country, in the Sichuan province. In 1975, a magnitude 9.5 earthquake struck the same spot, killing more than 70,000 people. In the 1980s, a series of earthquakes in the same area prompted the government to build a nuclear power plant there. In the 1990s, a series of earthquakes in the same area prompted the government to build a nuclear power plant there. In the 2000s, a series of earthquakes in the same area prompted the government to", "input_text": " back thousands of years and, unnervingly, almost completely defy any pattern.\n\nStein pulled up an animation on his computer of a millennium of Chinese earthquakes. \u201cOK we\u2019re starting at the year 1300,\u201d he said. In 1303 a magnitude 8.0 lights up Hongtong in the Shanxi province of North China. \u201cSo there\u2019s this huge earthquake here and then you say, \u2018Ok well there must be this structure here, maybe the next one will be on it.\u201d And sure enough, not far away, in 1556, a magnitude 8.3 pops up to the southwest in Huaxian. \u201cThis is another huge earthquake, so then you say, \u2018Ok, this is a really dangerous area now.\u2019\u201d He clicked the next slide. \u201cAnd then this is 1668.\u201d In 1668, 500 miles to the east, a massive 8.5 earthquake pops up out of nowhere. \u201cThen you say, \u2018Oh, well we never even suspected this fault system.\u2019 So if you were building nuclear power plants for the Ming Dynasty you wouldn\u2019t build \u2018em here and you wouldn\u2019t build \u2018em here,\u201d he said, pointing to the now at least two demonstrated trouble spots. \u201cI don\u2019t know why people in California or anywhere worry so much about earthquakes. They are such a small hazard compared to things like traffic.\u201d \u201cBut then the next one\u2019s here,\u201d he said, as I followed, the dots dance from one part of the screen to the other. Only 11 years later, the threat had unpredictably migrated hundreds of miles north, to just outside of Beijing, pummeling the region with a magnitude 8 earthquake in 1679. The scattering continued to the present day, as giant earthquakes filled in the blank spaces on the map. In 1966, a sequence of earthquakes alerted officials to a new trouble spot in"} +{"output_text": "\" y \"falta de presentaci\u00f3n\".\n\nEn el caso de Oaxaca, el secretario de Educaci\u00f3n P\u00fablica, Jos\u00e9 Ram\u00f3n Coss\u00edo, dijo que el gobierno federal no tiene un plan para despedir a los maestros que no acudan a las evaluaciones.\n\n\"No tenemos un plan para despedir a los maestros que no acudan a las evaluaciones. No tenemos un plan para des", "input_text": " to begin his trial for allegedly raping a woman in 2013 and forcibly performing oral sex on another in 2006.\n\nHe faces five felony charges based on the claims by two women, one of whom remains anonymous. Several other women who say he assaulted them will also testify, including \"The Sopranos\" actress Annabella Sciorra, as prosecutors seek to prove he committed sex crimes against multiple women.\n\nJury selection in the New York case is expected to begin Tuesday and last for two weeks, followed by arguments and testimony expected to last another 8 weeks.\n\nIn emails to CNN prior to the trial, Weinstein said that he has learned to self-reflect over the past two years.\n\n\"The past two years have been grueling and have presented me with a great opportunity for self-reflection,\" Weinstein wrote. \"I realize now that I was consumed with my work, my company and my drive for success. This caused me to neglect my family, my relationships and to lash out at the people around me. I have been in rehab since October 2017, and have been involved in a 12-step program and meditation. I have learned to give up my need for control.\"\n\nCorrection: This story has been updated to reflect the correct year accusations against Harvey Weinstein were first reported by The New York Times and The New Yorker. Rafael L\u00f3pez\n\nHasta el 10 de agosto de este a\u00f1o, ning\u00fan maestro de Oaxaca, Guerrero, Chiapas y Michoac\u00e1n ha sido despedido por no presentarse a las evaluaciones de desempe\u00f1o ni por haber acumulado faltas consecutivas.\n\nEn otros estados, donde la disidencia magisterial es pr\u00e1cticamente nula, s\u00ed hubo despidos por \"inasistencia"} +{"output_text": " all. We are so grateful to the Peter McVerry Trust for their work and to all the artists who have contributed to this album. We hope you enjoy it. X https://t.co/0ZwAKA6uzq \u2014 Saint Sister (@Saint_Sister_) December 6, 2018\n\nThe album is available to pre-order now from the following retailers:\n\nCD:\n\nAmazon: https://www.amazon.com/dp/B07H6", "input_text": "Irish music blog The Last Mixed Tape selected Saint Sister\u2019s debut album, The Shape of Silence, as their album of the year. The album was co-produced by Hozier\u2019s bassist and Musical Director, Alex Ryan.\n\nThank you to @TheLastMixTape for choosing 'Shape of Silence' as their record of the year. It is such a scary thing to release a big body of work into the world, especially a debut, so this really means the world to us. What brilliant company too! We are shook. X https://t.co/0ZwAKA6uzq \u2014 Saint Sister (@Saint_Sister_) December 6, 2018\n\nSomeone New co-writer, Loah, is releasing one-remixed track from her debut EP, This Heart, every Wednesday on Nialler9\u2019s blog. This week\u2019s track is taken from an interlude on the EP and expanded into a full song, Keep Your Heart, featuring God Knows from Rusangano Family, Bantum, Ben Bix and Dunny. Loah was also featured in a video from Other Voices after her appearance there last week.\n\nSaint Sister, Loah, Ro\u00edsin O, Bono, and many more artists are featured on a cover of Homeward Bound, the single from the album, Street Lights, to raise funds for the Peter McVerry Trust which works to help the homeless in Ireland.\n\nAlso on the album is Saint Sister\u2019s cover of The Cranberries, Dreams, recorded live in Dolan\u2019s Limerick.\n\nThis album is out today, featuring lots of incredible artists and for a brilliant cause. We are so delighted to have our version of \u2018Dreams\u2019 (live from Dolans, Limerick) by The Cranberries in amongst it"} +{"output_text": " register of the court.]\n\nThe Supreme Court of Canada\n\nThe Supreme Court of Canada\n\nThe Supreme Court of Canada is the highest court in Canada. It is the court of last resort for all matters in the federal court system. The Supreme Court of Canada is composed of the Chief Justice of Canada and the other judges of the Supreme Court of Canada.\n\nThe Supreme Court of Canada is composed of the Chief Justice of Canada and the other judges of the Supreme Court of Canada.\n", "input_text": " unqualified\" to run the country.\n\nHe however, like many others, was forced to congratulate the leader after his presidential win.\n\nOn November 9, in front of a crowd of people and acknowledging the differences between himself and Trump, he spoke of working hard to ensure a healthy transition would be achieved.\n\n\"It is no secret that the President-elect and I have some pretty significant differences,\" Obama said with Vice President Joe Biden at his side.\n\nWe are now all rooting for his success in uniting and leading the country.\"\n\nDespite protests taking place across America after Trump claimed presidential victory, the next leader of the US will be Donald Trump.\n\nPoliticians made a joke of his candidacy, calling him unqualified, racist and sexist. He may have offended people, creating controversy and causing division, but political leaders around the world have little choice and will have to work with Donald J. Trump as the 45th President of the United States of America. (1) Appeal allowed in part. (2) Set aside orders 1, 2 and 3 of 23 April 2018 of the Supreme Court and, in lieu thereof, order that: (a) the first, second and third appellants pay to the Registrar of the Court a fine for contempt of court in the sum of $7,500; (b) the fourth appellant pay to the Registrar of the Court a fine for contempt of court in the sum of $2,500; and (c) the appellants pay the respondents\u2019 costs of the proceedings below on an ordinary basis. (3) The appellants to pay 75 per cent of the respondents\u2019 costs of the appeal.\n\n[Note: The Uniform Civil Procedure Rules 2005 provide (Rule 36.11) that unless the Court otherwise orders, a judgment or order is taken to be entered when it is recorded in the"} +{"output_text": " a variety of toppings, including the house-made pork belly, which is a must-order. The ramen is a little on the salty side, but the broth is so flavorful that it\u2019s not a problem. NG\n\n25. The Bistro\n\n1401 Main St., KCMO. | Moderate.\n\nThe Bistro is a cozy, intimate restaurant with a small, but well-curated wine list. The menu is a", "input_text": " Rieger\n\n1924 Main St., KCMO. | Expensive.\n\nIf you sit at the end of the chef\u2019s counter in front of the kitchen at The Rieger, you\u2019ll be able to spot a slogan painted in cursive over the expo station, facing the line: \u201cBeautiful food for the people.\u201d This is chef Howard Hanna\u2019s mantra, one he\u2019s stuck to since opening Rieger in 2010. But at Rieger \u2014 named for the historic hotel building erected in 1915 in the Crossroads, which the restaurant calls home \u2014 beautiful food doesn\u2019t mean fussy. Hanna\u2019s dishes are layered and thoughtful. He\u2019s known for writing novella-length menu notes for his staff with every change, delving into the local farms he\u2019s sourced the flora and fauna from. But his dishes are also rustic and familiar. The handmade pastas are always winners \u2014 the mafaldine with local mushrooms and Pecorino is exceptional \u2014 and there would be riots if Hanna\u2019s beloved pork soup, with its toasted blanket layer of Gruyere, ever left the menu. NG\n\n24. Shio Ramen Shop\n\n3605 Broadway Blvd., Midtown, KCMO. | Inexpensive.\n\nSit at the three-seat bar at the tiny Shio Ramen Shop in Midtown and one of the first things you\u2019ll notice is the behemoth metal contraption behind the counter taking up considerable space that, in any other restaurant, would have been prime spirit storage. This is chef-owner Patrick Curtis\u2019 prized Yamoto noodle machine, which he employs daily to produce his impossibly bouncy, chewy ramen noodles. These wavy angel hairs go into a number of rich broths and are topped with"} +{"output_text": " best way to get a full picture of the property, but it can be a good way to get a quick look at what\u2019s going on around the property.\n\n15. What Are the Current Zoning Laws?\n\nIf you\u2019re considering a vacant lot zoned for commercial development, make sure you\u2019re not inheriting any environmental contamination with the property.\n\nFor most commercial properties, the best way to do this is by ordering a \u201cPhase I Environmental Report\u201d (", "input_text": " ask them if they wouldn\u2019t mind driving by the property and snapping a few pictures when they have a chance. Most agents are regularly in the field anyway, and it isn\u2019t a huge ask for them to swing by your property and get some pictures (especially if you show an interest in using them for your future listings and/or paying them a few bucks for their trouble).\n\nHowever you decide to get a look at the property \u2013 make sure you are 100% clear about where the property is located, and communicate this thoroughly to the person getting pictures (send them parcel maps, directions, even a video if you need to).\n\n14. What Were the Previous Uses of the Property?\n\nMost states have environmental laws that pertain to commercially zoned property (i.e. \u2013 properties zoned \u201cResidential\u201d generally aren\u2019t held to these standards). If you\u2019re considering a vacant lot zoned for commercial development, make sure you\u2019re not inheriting any environmental contamination with the property.\n\nFor most commercial properties, the best way to do this is by ordering a \u201cPhase I Environmental Report\u201d (many banks will automatically require this because it affects their collateral). This report will identify if there are any \u201cRecognized Environmental Concerns\u201d (RECs) on the property that you need to worry about. If you neglect to do any environmental due diligence, the liability for any existing environmental contamination on the property could ultimately fall on your shoulders \u2013 making it very difficult and expensive to sell the property in the future.\n\nAnother way to get a quick look at what has been going on (and around) the property in question is to look at the historical satellite imagery of the property with a tool like Google Earth.\n\nHere\u2019s a quick look at how it works\u2026\n\nAs you can see, it\u2019s not necessarily the"} +{"output_text": "s early history was a lot more complicated than we thought.\n\n\u201cThe problem is that the early Sun was fainter than it is today,\u201d said planetary scientist David Catling, who studies the early history of the Solar System at the University of California, Santa Cruz. \u201cIt was fainter than it is today, but it was also cooler. So the early Sun was not as hot as it is today.\u201d\n\nThe faint young Sun paradox is a problem for scientists because it means that", "input_text": ", they took the monkey. Bieber would later concede, \u201cIt was the farthest thing from fine.\n\nMally found a new home in a German zoo but had trouble adjusting. Like many primates who wind up as pets, he was taken from his mother at a too early age. \u201cThese little guys are taken from their mothers within days after birth,\u201d Dunnigan said \u201cThat is in and of itself one of the most traumatic things that can happen to them.\u201d\n\nDunnigan added: \u201cIn the wild, they stay attached to their mothers and don\u2019t leave her body for almost an entire year after they\u2019re born.\u201d After four and a half years of exploring Mars, NASA\u2019s Curiosity rover has made a new discovery that only deepens a long-standing mystery about the Red Planet \u2014 namely, how the world used to be so wet. Pretty much all Mars scientists agree that billions of years ago the planet had flowing rivers and lakes on its surface. But there\u2019s a problem: no one can quite explain how ancient Mars was warm enough back then to support liquid water. And Curiosity is unearthing clues that only make things more confusing.\n\nCuriosity is unearthing clues that only make things more confusing\n\nSince its landing in a region called Gale Crater, the rover has found critical signs that liquid water once pooled on the Martian surface. Curiosity has been scouring over hundreds of meters of sedimentary rocks that are thought to have been deposited by a lake that existed in the crater 3.5 billion years ago. But there\u2019s an issue with timing: back when Mars supposedly had water on its surface, the Sun wasn\u2019t cranking out that much heat. It\u2019s a conundrum known as the \u201cfaint young Sun paradox,\u201d and it\u2019s the idea that our Solar System\u2019"} +{"output_text": " them, I found that he was in a league with a bunch of people who were in the top 100K. I\u2019m not sure if that\u2019s a coincidence or not.\n\nThis is a guy who I\u2019ve never heard of before, but he\u2019s in the top 100K in a league with a bunch of people who are in the top 100K. I\u2019m not sure if that\u2019s a coincidence or not.\n\nThis is a guy who I\u2019", "input_text": "serious\u201d FPL players probably regard this as a trivial little side game, but I think it\u2019s just a weekly delivery of a whole new box of puzzles. Each week we get just a name and a team page. Who is this dude? Is it a dude? Is it a famous dude? Is it a grumpy middle school history teacher from Newcastle? Is it the future Australian Men\u2019s National Team Coach? Is it a stripper/model/waitress/actress from South Africa? Is it a high school cricket player from somewhere in Ireland? The internet is a wonderful thing and with a few clicks here and a few clicks there, voila!, your opponent is no longer a random name but is a flesh and blood, real life person with whom you are locked in mortal combat.\n\nI\u2019ve been doing this thing for a few years for my minileague and wanted to share the fun. I think I probably find about 2/3 of the opponents with pretty high degree of certainty. And when I can\u2019t find them, I sure don\u2019t let the truth get in the way of a good story. Here\u2019s what I dug up this week on the Cup opponents for the FMLFPL crew. Team pages are linked.\n\nThis is Sayantan\u2019s sixth year playing FPL, vs. Walsh\u2019s fifth, but the Indian gent has yet to crack the top 150K. He played a Free Hit Chip back in GW12, but it really doesn\u2019t look like he made very much of it as he only changed about five players. Weird. His is a much more common name than you might initially think (unless you\u2019re Indian and it probably seems like Sean Thompson, or something like that), and the search took a while. He\u2019s in a bunch of leagues, but when I looked at the smallest of"} +{"output_text": "ers say? I mean, it\u2019s like, \u201cOh, she\u2019s a good shooter, but she\u2019s not a good shooter.\u201d And I\u2019m like, \u201cNo, she\u2019s a great shooter. She\u2019s a great shooter.\u201d And I\u2019m not just saying that because I\u2019m a fan of her. I\u2019m saying that because I\u2019m a fan of the game. And I\u2019m saying that because I\u2019m a fan of the game", "input_text": " But I don\u2019t think it was seen by most people in D.C. as the game-changer that real WNBA heads knew it was. The real fans, like myself \u2014 we knew: Elena was the missing piece we\u2019d been waiting for. The Mystics were about to be a problem.\n\nElena was the missing piece we\u2019d been waiting for. The Mystics were about to be a problem.\n\nAnd it\u2019s been amazing to watch. She\u2019s just one of the greats of the greats. The thing about Elena is \u2014 she\u2019s a mismatch every night. Because she\u2019s a legit big, with natural size and length, but she also straight up has the skills of a guard. And I don\u2019t mean she has \u201cguard skills\u201d like she can do 70% of what a guard can do, or 80%\u2026.. I mean, she literally has all the skills that an elite guard has. Not just the jumper, which is lights out. Or the free throws, which \u2014 Elena is probably the best free throw shooter in the world, WNBA or NBA or wherever. But she also has this low-key craftiness with the ball that I think a lot of people don\u2019t realize. She uses the angles on the floor to see outcomes before they happen, like very few players are able to do. She gets to her spots right on time, she knows how to draw the perfect foul, she knows how to finish through contact \u2014 all those little things, they really just come down to spotting angles. And Elena does that at a genius level.\n\nAlso, 50-40-90?! Is she kidding with those numbers?! What I love about the 50-40-90 Club is that it\u2019s an accomplishment where you\u2019re just like\u2026.. what can the hat"} +{"output_text": " winner, he has coached at the highest levels of college football.\n\nHe has been a part of the winningest coaching staffs in the country.\n\nHe has been a part of the winningest coaching staffs in the country.\n\nHe has been a part of the winningest coaching staffs in the country.\n\nHe has been a part of the winningest coaching staffs in the country.\n\nHe has been a part of the winningest coaching staffs in", "input_text": " Coach Gilmore has coached numerous NFL players, draft picks & All-Americans, and was named the 2011 WR Coach of the Year when he was at USC. Welcome Coach Gilmore! pic.twitter.com/lBy0p5Zo0w \u2014 Michigan State Football (@MSU_Football) February 21, 2020\n\nMichigan State made additional staff announcements on Friday with Geoff Martzen named director of player personnel, Cody Cox named director of football operations and former standout Spartan linebacker Darien Harris named director of player engagement.\n\nTucker, who was a defensive back at Wisconsin from 1990-94, was hired last week as Michigan State\u2019s new head coach after spending last season at Colorado and replaced Mark Dantonio. His six-year contract includes a $6 million pool to spend on hiring his 10 on-field assistants.\n\nThe addition of Gilmore gives the Spartans seven assistant coaches officially in place. The program announced on Saturday that Mike Tressel and Ron Burton will remain with the team as carryovers from Dantonio\u2019s staff. Burton will coach the defensive line while Tressel, who was the defensive coordinator and linebackers coach last season, doesn\u2019t have an announced role yet. Chris Kapilovic, who spent last season on Tucker\u2019s staff at Colorado, was announced Monday as the Spartans\u2019 new offensive line coach and run game coordinator. Courtney Hawkins, a former Spartan standout and Flint Beecher coach and athletic director, was named the new wide receivers coach on Wednesday night. That was followed by Thursday\u2019s announcements that former Colorado assistant Jay Johnson will be the offensive coordinator and quarterbacks coach while former Spartan All-American and assistant Harlon Barnett returned as the defensive backs coach.\n\nCoach Gilmore is a true difference maker. \ud83d\udc4a\ud83c\udffe\n\nA proven"} +{"output_text": ", but Resident Evil 4 was a game that was a bit of a surprise. It was a game that was a bit of a departure from the series, and it was a game that was a bit of a departure from the horror genre. It was a game that was a bit of a departure from the horror genre.\n\nThe game was released on January 25th, 2005, and it was a game that was a bit of a departure from the horror genre. It was a game that", "input_text": " after an unfortunate scooter accident, I fell, broke my wrist, and landed myself in the hospital. It wasn\u2019t a very traumatic experience, and when I went to the orthopedist he asked what color cast I wanted. I said I didn\u2019t care, but I did have one request that sort of baffled him at first. I asked him if he could cut some extra room around my thumb, so I had more dexterity. I also wanted my fingers to be more exposed as to move them around more. He complied, and asked why this was so, since it was a very unusual wish. I stated emphatically and unequivocally: \u201cResident Evil 4 comes out next week.\u201d\n\n2005 was that awkward period in the gaming industry (and in my life), the one where a new generation of consoles was on the horizon, and things slowed down as the future loomed bright and mysterious. The Xbox 360 had yet to be formally introduced, and publishers were releasing the last great games of the Gamecube and PS2 era. We were just coming out of one of the greatest holiday seasons ever, with Sly 2, Burnout 3, Katamari Damacy, GTA: San Andreas, Ratchet & Clank: Up Your Arsenal, Halo 2, and Half-Life 2 all being released in a four month period at the end of 2004. I don\u2019t think anyone was expecting to have their minds blown so early in the new year, but Resident Evil 4 was something that surprised a lot of people.\n\nThe franchise, at that point, has seen a slew of spin-offs and prequels. Resident Evil Zero, Dead Aim, Outbreak, and Outbreak File #2 all weren\u2019t exactly titles that met the quality of previous installments. Expectations were high, and they always are with numbered sequels"} +{"output_text": " Japanese Baseball Research Journal.\n\n\"It's a lot more intense,\" Graczyk said. \"It's a lot more intense than the American game. It's a lot more intense than the Japanese game. It's a lot more intense than the Korean game. It's a lot more intense than the Chinese game. It's a lot more intense than the Taiwanese game. It's a lot more intense than the Philippine game. It's a lot more intense than the Vietnamese game. It", "input_text": " at 9:30 a.m., but after a five-minute walk from the train station, I couldn't believe what I saw as I approached the Dome. Several hundred people -- some of whom had appeared to have camped out the night before -- were in an orderly line snaking up to a ticket window that hadn't even opened yet.\n\nThey were hoping to get standing-room tickets for that afternoon's game. Only about half of them would eventually get inside the stadium, and even then, their view of the action would be blocked by so many people standing in front of them, they would end up watching the game on one of the many TVs located throughout the Dome's concourse.\n\nTrue, it was a holiday weekend in Japan and the Giants and Tigers are two of the country's most popular teams. But I had not expected to see this level of interest for a regular-season game just three weeks into the season between the third- and fifth-place teams in a six-team league.\n\nOnce I went inside the Dome, I was even more surprised at the depth of the fans' passion for a game the baseball-crazy Japanese call yakyu. From the first pitch through the final out, the intensity of the fans was unlike anything I've ever experienced at an American baseball game. I wasn't sure if it felt more like being in the stands at an SEC football game or a zealot-packed political rally.\n\n\"It's much noisier than Major League Baseball games,\" said Wayne Graczyk, the dean of American baseball writers in Japan, who I met after the game for pizza and beer at a sports bar near the Dome. Graczyk first began writing about Japanese baseball in 1975 and is now a baseball columnist for the English-language Japan Times. He also is the longtime editor of the"} +{"output_text": "land en de VS hebben aangesloten bij de oproep van Guaid\u00f3.\n\nDe Nederlandse regering heeft inmiddels een beleid dat is gericht op de \u2018humanitaire hub\u2019 in Cura\u00e7ao. Het is een van de drie landen waaruit de Nederlandse ambassadeur in Venezuela, Jeroen van der Veer, de vluchtelingenstromen naar Nederland kan", "input_text": " wettig staatshoofd. Venezuela is via Aruba, Bonaire en Cura\u00e7ao een buurland van Nederland. In de strijd tegen de drugssmokkel wordt intensief samengewerkt. \u201eDat alles loopt nu risico\u201d, zegt Gil tijdens tijdens een gesprek in Den Haag.\n\nAfgelopen week was Gil al in Itali\u00eb, Griekenland en Noorwegen, het handjevol Europese landen dat Maduro\u2019s concurrent, Juan Guaid\u00f3, niet heeft erkend. Later deze week volgt Rusland, de hand die Venezuela momenteel voedt. Nederland, waar Gil zondag aankwam, zit in dit rijtje vanwege de \u2018ABC-eilanden\u2019. Vorige week werd bekend dat Cura\u00e7ao een \u2018humanitaire hub\u2019 wordt voor het door tekorten geplaagde Venezuela. Op verzoek van Guaid\u00f3 en de VS, zei minister Stef Blok (Buitenlandse Zaken, VVD). Het leidde tot vragen, van SP en regeringspartij D66, want waar zijn de Verenigde Naties en het Rode Kruis in dit verhaal?\n\nPremier Rutte stond vorig jaar september tijdens de Algemene Vergadering van de VN uitvoerig stil bij het belang van internationale samenwerking. Ten aanzien van Venezuela lijkt vooral sprake van nauwe Nederlands-Amerikaanse afstemming. In de Kamer zei Blok dat dit zo is, omdat de VN Neder"} +{"output_text": "the order of the Knights of St John of Jerusalem).\n\nThe road is also the main artery for the Syrian army\u2019s supply lines to the north, which are based in the nearby town of al-Sukhna. The rebels have been trying to cut the road for months, but the Syrian army has been able to keep it open.\n\nThe rebels have also been trying to cut the road for months, but the Syrian army has been able to keep it open.\n\n", "input_text": " If you decide to buy an LTM (Lifetime membership - which gives you an 80% reduction in fees and other premium features) i will rebate 10% of your membership costs in BTS. The battle lines of the Syrian civil war are edging closer to Krak des Chevaliers, the most famous Crusader castle ever built. The massive walls and towers of the great fortress on its hilltop glistened white in the sunshine yesterday, as the Syrian Army fought rebels in the valleys below.\n\nThe rebels hold the castle and the two nearby villages of al-Zara and al-Hosn while much of the rest of this area, 25 miles west of Homs city and just north of the Lebanese border, is inhabited by Christians who support the government. The 13th century castle was damaged by a Syrian air force attack and mortars last year and the Syrian government says it is eager to prevent further damage.\n\n\u201cWe launched an operation to retake this area last week,\u201d the governor of Homs, Talal al Barazi, told The Independent. He said that so far the army had taken 50 per cent of al-Zara \u201cand we think the rest of it will be in our hands within a week.\u201d Syrian army officers on the spot were more cautious on how long the fighting was going to last, saying it might be a week or two.\n\nDownload the new Independent Premium app Sharing the full story, not just the headlines\n\nThe reason why the Syrian army is attacking has less to do with Krak des Chevaliers\u2019 strength as a defensive position and more to do with strategic importance of the area in which it stands. This commands the main road between Homs and Tartous on the coast, just as it did in the 13th century when the castle was rebuilt in its present form by the Knights Hospitaller ("} +{"output_text": " for its \u201cunwillingness to take a stand against segregation in the medical profession.\u201d The AMA\u2019s response was to deny that it was racist. \u201cThe AMA has never been a racist organization,\u201d the association\u2019s president, Dr. John W. Hinckley, declared in a 1966 speech. \u201cWe have never discriminated against any race or any group of people.\u201d\n\nThe AMA\u2019s racist history is not a secret. In the early 1960s, the", "input_text": "ickness, were employers and commercial health insurers, who amplified the AMA\u2019s agenda by constructing their \u201cself-interest as the public interest\u201d in concert with one another, presaging coalitions like the Partnership. The movement for national health insurance picked up steam again after World War II as labor unions pushed a single-payer plan and Harry Truman declared \u201chealth security for all\u201d a top priority of his nascent presidency. Two-thirds of the American public supported Truman\u2019s proposal for a national health insurance program. An alarmed AMA stated beating the war drums. In 1948, it enlisted Campaigns, Inc., the world\u2019s first political consulting firm, to lead the attack. Armed with a $5 million war chest amassed from $25 AMA membership fees, Campaigns, Inc. oversaw the delivery of 55 million pieces of propaganda, blending Cold War hysteria with nationalistic fervor that denounced single-payer health care as \u201csocialized medicine.\u201d At the same time, the AMA recognized the potential of the burgeoning private health insurance market to keep single payer at bay. \u201cThe Voluntary Way Is the American Way,\u201d AMA doctors told patients. The campaign was \u201cthe largest political offensive ever waged against a single piece of legislation in U.S. history,\u201d Chapin writes. By the early 1950s, the movement for a national health program lay in ruins. Not content to simply fight single payer, the AMA also provided cover for Jim Crow. It continued to justify its Southern medical societies\u2019 discriminatory membership policies against African-American physicians as well as the association\u2019s refusal to take a stand against segregated medical services, which threatened the health of millions of people. Speaking to doctors with the Medical Committee for Human Rights (MCHR) in 1966, Martin Luther King, Jr excoriated the AMA for perpetuating racism in medicine and"} +{"output_text": " of the Lagos lagoon.\n\nThe Lagos lagoon is a vast, shallow body of water that stretches from the mouth of the Benin River to the mouth of the Ogun River. It is the largest lagoon in Africa and the second largest in the world.\n\nThe lagoon is a vital source of water for the city. It is also a major source of fish and a major tourist attraction.\n\nThe lagoon is also a major source of pollution. The", "input_text": "os, in the 1990s, was the ultimate dysfunctional city and an example of what happens to a society where the state is absent. At that point the state had really withdrawn from Lagos; the city was left to its own devices, both in terms of money and services.\n\nThe city was being governed by the military. But it was not cut out for governance, had no accountability and couldn\u2019t care less about planning and environmental issues. As a result it routinely disregarded existing regulations.\n\nIn the 1990s, for instance, the largest public park in Lagos\u2014the old, colonial 10-hectare Victoria Park in Ikoyi\u2014was sold as residential development land. The waterfront of the Lagos Cowrie Creek in Victoria Island was also sold for commercial development, effectively blocking direct public access to the waters and a picturesque view of Ikoyi.\n\nThe collapse of zoning all over Lagos also led to residential neighborhoods such as Victoria Island and southwest Ikoyi being converted for commercial use. The military had no reasoned response to Lagos\u2019 urban challenges. Instead, it took the decision in 1975 to establish a new capital in Abuja.\n\nThis move, which finally came to fruition in December 1991, left Lagos forlorn.\n\nAP Photos/George Osodi Slums in Lagos\n\nThe positives\n\nPositive changes have taken place.\n\nFor example, over the past 15 years the authorities succeeded in raising more taxes using money to restore basic infrastructure, expand public services and strengthen law enforcement.\n\nResearch shows that the commitment to reform the city was driven by electoral pressures as well as elite ambitions to construct an orderly megacity. The return to democracy helped to make these changes possible by enabling an elected government to work in the people\u2019s interest.\n\nImprovements includes public transport and the reclamation"} +{"output_text": " to the corporate welfare schemes that have been enacted by Republicans in the past. The ACA is a massive corporate welfare scheme that will be paid for by the middle class and working poor.\n\nThe ACA is a massive corporate welfare scheme that will be paid for by the middle class and working poor.\n\nThe ACA is a massive corporate welfare scheme that will be paid for by the middle class and working poor.\n\nThe ACA is a massive corporate welfare scheme that will be paid", "input_text": ", and the Consumer Choice Health Security Act sponsored by Sen. Don Nickles (R-Okla.) and Rep. Cliff Stearns (R-Fla.).\n\nThe half-dozen largest insurance companies favored the Clinton plan, which they helped write (consumer advocates were excluded), while smaller firms represented by the Health Insurance Association of America (HIAA) favored GOP proposals. HIAA ran the famous \"Harry and Louise\" ads against the Clinton plan.\n\nGov. Mitt Romney signed the individual mandate into law in Massachusetts in 2006, drawing praise from Senators Jim DeMint and Orrin Hatch and other Republican leaders because of the mandate's boost for private business. It was even part of a bipartisan bill co-written by Senators Bob Bennett (R-Ut.) and Ron Wyden (D-Oreg.) in 2007.\n\nSome Republicans and the Cato Institute opposed it, but there's no doubt that the individual mandate was a Republican scheme until Democrats grabbed hold of it in 2009. After that, Republicans denounced the mandate and called it socialism.\n\nInsurance companies, whose representatives attended the health-care reform panels hosted by Democrat leaders in 2009 and helped draft the ACA, knew that the new legislation was designed to provide them a massive windfall. Whether the ACA was passed or defeated in Congress, they'd be the real winners. The ACA debate was rigged from the beginning by insurance and other corporate lobbies whose profits and high overhead, burdening the US with the highest medical costs of any nation on earth, would be maintained.\n\nIn the real world, no genuine socialist would ever jump on board a bill that imposes a direct public subsidy for the financial sector. Neither can the ACA be compared with Social Security or Medicare, which are administered efficiently by government agencies with minimal overhead costs.\n\nThe ACA is far more comparable"} +{"output_text": "; if they can go to work or not; if they can go to church or not; if they can go to the grocery store or not; if they can go to the doctor or not; if they can go to the park or not; if they can go to the movies or not; if they can go to the mall or not; if they can go to the beach or not; if they can go to the bar or not; if they can go to the restaurant or not;", "input_text": " miserable country anywhere on this globe. Why should America escape my misery and pain?\n\nI\u2019d give my support to anyone who\u2019d take the AMERICAN Dream away.\n\nI\u2019d hate President DONALD J TRUMP.\n\nI\u2019d hate anyone who wants to make America great again.\n\nI\u2019d hate AMERICAN patriots and military veterans and capitalists.\n\nI\u2019d hate first responders. I\u2019d hate anyone who is courageous and does good.\n\nI\u2019d hate business owners who think they are special, who believe in freedom, who think their work ethic and sacrifice have earned them success and prosperity. I\u2019d want to wipe that smirk right off their faces. Id want them to suffer.\n\nTo accomplish all this\u2026\n\nI\u2019d want to create a pandemic just like Coronavirus.\n\nIt would magically save China\u2019s collapsing economy and prop up their evil communist party.\n\nIt would wipe away all of Trump\u2019s magical and miraculous economic achievements of the past 3+ years\u2026in a shocking, depressing, tragic month.\n\nIt would crash the stock market; collapse the oil and energy markets; kill tens of millions of jobs; close the entire US economy; shutter millions of small businesses\u2026\n\nMake virtually every American an instant welfare addict and ward of the state; eliminate Trump\u2019s 20,000-person stadium rallies; overwhelm the entire economic system; bring business owners to their knees; reward the Deep State and D.C. Swamp\u2026\n\nMake big government the only option to save a drowning nation, no longer so exceptional\u2026\n\nMake Americans feel powerless and literally turn them into sheep- willing to hide in their homes, accept government telling them if their businesses are \u201cessential\u201d or not; deciding if they can earn a living or not"} +{"output_text": " and \u2018please\u2019 and \u2018thank you\u2019 to help you communicate.\n\nIf you\u2019re not comfortable with this, you can also use a phone call or text.\n\nIf you\u2019re not comfortable with this, you can also use a phone call or text.\n\nIf you\u2019re not comfortable with this, you can also use a phone call or text.\n\nIf you\u2019re not comfortable with this, you can also use a phone call or text.\n\n", "input_text": " next day, though, it can become a problem. I highly recommend not only considering your current health and mental issues, but also your schedule for the days that follow.\n\nProblem solving these issues together can help you and your play partner become closer. It will also help them trust you more, and we all know trust is incredibly important in kink relationships. Honestly, supportive problem solving around my health issues is something that would turn me on.\n\nCommunicate!\n\nHonestly, you should know this is important already. It\u2019s vital in every situation to keep things consensual and as safe as possible. Again, it\u2019s imperative to use those safe words and movements you set up ahead of time. I recommend setting up breaks during play for bathroom visits, taking necessary, prescribed medications, to hydrate, and to check-in.\n\nAftercare\n\nI\u2019m going to be honest \u2013 aftercare may look a lot different than normal.\n\nYou may need to help us get dressed or back into a wheelchair. You might need to get us secured into leg braces. You might have to help with bathroom stuff. We may need some care in the next few days that isn\u2019t just a text or call.\n\nAftercare is going to look different depending on what your play partner\u2019s needs are. Make sure that you cover potential aftercare ideas during the negotiation process.\n\nCreate a feedback loop\n\nIf you\u2019ve played with someone, part of aftercare should be giving feedback. While most people think about this purely in regards to sex, it can be integral to play pals, too. Set up a time a few days after play to talk about your session. Ideally, this would be in-person and in a relatively sober, non-play space. Use non-violent communication techniques such as using \u2018I\u2019 statements"} +{"output_text": "4 would have been approximately 15 years old at death.\n\nThe femur of BMRP 2006.4.4 is the only known femur of a tyrannosaurid with a complete growth series. The femur of BMRP 2006.4.4 is the only known femur of a tyrannosaurid with a complete growth series. The femur of BMRP 2006.4.4 is the only known femur of a tyrannosaurid with a complete growth series.\n\nThe", "input_text": "MRP 2006.4.4 was >15 years old. The number of CGMs missing due to medullary expansion is unknown, precluding an exact age at death for BMRP 2002.4.1 and BMRP 2006.4.4. Although the number of missing CGMs could be predicted on the basis of innermost zonal thicknesses and a process of retrocalculation [e.g., (5, 10)], the variable spacing between CGMs observed in BMRP 2002.4.1 and BMRP 2006.4.4 and other tyrannosaurs (15) renders the technique unreliable in this case, and it was not attempted. Within the innermost cortex of BMRP 2006.4.4, there is a tight stacking of six CGMs (Fig. 2D). Because the CGMs remain parallel about the cortex and do not merge, they either represent a single hiatus in which growth repeatedly ceased and resumed (totaling 13 years of growth) or up to 6 years where relatively little growth occurred annually (totaling up to 18 years of growth) (9, 16). This tight stacking of six CGMs is not observed in the femur of BMRP 2006.4.4, which preserves 15 CGMs. The CGM count from the partial tibia of BMRP 2006.4.4 is questionable because the proximal sampling location away from midshaft incorporates the fibular crest, introducing associated regions of remodeling and directional growth affecting apposition interpretations. Because of this and their absence in the femur, the observed grouping of six CGMs is conservatively interpreted as a single hiatus event. Similar instances of a single hiatus represented by narrowly spaced LAGs are reported in other tyrannosauroids (15). If this grouping of CGMs instead represents 6 years of protracted growth, then BMRP 2006.4."} +{"output_text": " and was accepted into the seminary in January.\n\n\u201cI was so excited,\u201d he said. \u201cI was so excited to be here. I was so excited to be a part of this community. I was so excited to be a part of this seminary. I was so excited to be a part of this school. I was so excited to be a part of this community. I was so excited to be a part of this seminary.\u201d\n\nHe\u2019s been here ever", "input_text": "ism by his high school football coach in Wyoming, Don Julian.\n\nBut, he didn\u2019t become interested in learning more until after the death of his childhood friend, Nicholas Bazemore.\n\n\u201cThat\u2019s where my faith conversion really exploded,\u201d he said. \u201cI got involved at a local Newman Center that was right by the dorm that I was living in. And, I got plugged into a FOCUS Bible study \u2014 Fellowship of Catholic University Students. And I met some great men there that were really key in my walk with God. They taught me a lot and they got me out of the hole that I was in.\u201d\n\nIn January 2013, he went to a FOCUS conference in Orlando, Florida, which was the spark that set his heart on fire for the Catholic faith. He enrolled in classes for Rite of Christian Initiation of Adults and started taking them three to four times a week from a priest on campus. He joined the Church in October 2013, one year after his friend died.\n\nEventually, he started thinking about the priesthood and how to discern the call. He learned that men from Sioux Falls would go to St. John Vianney, so he planned a visit there last fall.\n\n\u201cSt. Thomas was actually playing St. John\u2019s [University in Collegeville] that weekend, so I got to watch that game and visit the seminary,\u201d he said. \u201cI absolutely fell in love with the place.... The brotherhood here, I could feel it instantly, and I was so attracted to it. I was so attracted to the seminary, the school, the football team. I thought, \u2018That would be amazing if I could be here.\u2019 I prayed the entire way back to South Dakota.\u201d\n\nHe made quick work of the lengthy application \u2014 taking just three days to fill it out \u2014"} +{"output_text": " allowed to work for the city for two years after leaving office.\n\n\u201cI\u2019ve been consulting for the last year and a half,\u201d said former councillor and former mayor of Toronto John Parker. \u201cI\u2019ve been doing a lot of work with the city of Toronto, with the province of Ontario, with the federal government, with the United Nations, with the World Bank, with the United Nations Development Program, with the World Health Organization, with the United Nations, with the United Nations", "input_text": " book. I did find December, January, probably February rather depressing. It was very difficult to accustom myself to not being down at city hall and helping people and working through issues,\u201d said Doucette.\n\n\u201cWhen you\u2019ve done, you know, 60, 70 hours a week for eight years and you\u2019re out there helping people, it\u2019s suddenly very different to not be doing that. I found it harder than I thought I would.\u201d\n\nThe hours are terrible \u2014 councillors work weekdays, weekends and weeknights, for an annual salary of about $117,164 a year (in 2019), plus benefits. They must be prepared to endure public wrath and ridicule, while trying to make the best decisions for a city growing tall and gangly as a teenager, with the attendant emotional outbursts.\n\nThe job application process is months long, expensive and funded by the applicant and their supporters. It requires knocking on thousands of doors to win votes, to be greeted warmly or snubbed and sometimes lectured on the ways in which the city is going to hell in a handbasket.\n\nAnd yet, every election season produces a new crop of hopefuls and a new crop of retirees \u2014 some of whom, like Doucette, have chosen to leave the job, and some of whom are pushed out by voters.\n\nIn 2018, with the province downsizing the number of city wards, more councillors than usual began packing up their bags the day after the election. We checked in with some of them to find out what life has been like since leaving 100 Queen St. W.\n\nThe number one occupation for former councillors seems to be consulting \u2014 there is no shortage of public organizations and private firms happy to hire former city councillors. However, they must tread carefully \u2014 under city rules former councillors are not"} +{"output_text": " away from the museum. She didn\u2019t want to be seen. She didn\u2019t want to be caught. She didn\u2019t want to be found.\n\nShe parked her car and walked to the museum. She didn\u2019t know what she was going to do, but she knew she had to do it. She had to get inside. She had to find the painting. She had to find the painting and destroy it.\n\nShe walked into the museum and looked around. She", "input_text": "ked up, possibly to try to hide her annoyance. \u201cMost of our pieces are on loan from artists or other museums. I don\u2019t have the authority to sell the pieces and it is against museum policy to act as a broker.\u201d\n\n\u201cIs there a volunteer there who\u2019s a teenage boy? Bad teeth? Curly hair and his jacket is a little too big?\u201d\n\nThe woman giggled. \u201cThe jacket thing could describe all of us. I swear they use gorilla sizes for those things.\u201d The woman laughed far too loudly at her own joke. Vienna didn\u2019t make a sound. \u201cWhat is his name? I\u2019ll see if he\u2019s on the roster for today.\u201d\n\nVienna made a short strained noise from her throat. \u201cI don\u2019t know his name. He was working Saturday.\u201d\n\n\u201cOh, then he\u2019s not here. The list of weekend volunteers is different from the weekday volunteers.\u201d\n\nVienna struggled for another question, but all that came was dry, wheezing noises from her throat. The phone hummed to fill the gap in the conversation. Finally, the volunteer awkwardly apologized and wished Vienna a good day. Vienna practically screamed at the woman to wait, but she had already hung up. She dialed the number six more times, each time pretending to be a different person and hoping to get someone new on the other end. A volunteer named Pradeep finally asked her not to call anymore.\n\n\n\n\n\n\n\n\n\nVienna put on every black thing she owned and piled her darkest makeup on her face. At 2 a.m., She made sure the crowbar was in the back seat of her car. Angry tears smeared her makeshift face camouflage, but she didn\u2019t care. They were black, anyway.\n\nShe turned her headlights off a block"} +{"output_text": " can be found on the Flickr page.\n\nThe Camp 30 site is located at the intersection of Highway 401 and Highway 400, just north of Bowmanville.\n\nThe site is open to the public, but there is a small fee to enter.\n\nThe site is located at the intersection of Highway 401 and Highway 400, just north of Bowmanville.\n\nThe site is open to the public, but there is a small fee to enter.\n\nThe site is", "input_text": " while there. Apparently this is a popular spot for such intrusions. Hopefully the curious will still come when charged a small fee for upkeep and maintenance \u2013 if the Camp is lucky enough to make it that far.\n\nTHE FUTURE OF CAMP 30\n\nFor many years, a few dedicated locals have been trying to save Camp 30 from demolition. Recently the land was purchased by a developer who was not aware of the sites history.\n\nInitially, he publicly stated that \u2018he couldn\u2019t see the point in saving the derelict, vandalized buildings\u2019. After receiving a copy of a book written in part by a local historian about Camp 30 and its incredible history, he changed his tune.\n\nStating \u2018he had no idea\u2019 of the historical value of the land, 10 hectares of important Camp 30 terrain (the section containing buildings) was sectioned off by the empathetic developer to be decided upon at a later date. He will build around Camp 30.\n\nWhat happens now is anyones guess. The Federal Government has gifted a yearly amount to preserve and restore the site, but having been there myself, clearly it is not nearly enough.\n\nIn order for that money to be matched (which still wouldn\u2019t be enough), every resident of the Durham region would need to face a 3% property tax increase.\n\nBowmanville is not a wealthy town, though it is \u2018better off\u2019 than the small surrounding towns and hamlets \u2013 it simply can\u2019t handle such an increase to preserve world history.\n\nHopefully something is worked out that is not to the detriment of locals. It would be a travesty to lose this significant and rare piece of WWII history because of a \u2018simple\u2019 (but dramatic) lack of funding.\n\nClick to view an aerial of the historic site.\n\nMany additional photos"} +{"output_text": " years ago, the IANA announced that it would begin allocating addresses to the regional internet registries (RIRs) in the autumn of 2015. The RIRs are the organisations that manage the allocation of IP addresses in the world. They are responsible for allocating addresses to the various organisations that use them, such as internet service providers, web hosting companies and internet exchanges.The RIRs are not the only organisations that have been allocated addresses. The IANA has also allocated addresses to", "input_text": " to Hurricane Electric, an internet backbone and services provider based in Fremont, California, the internet will run out of bulk IP addresses sometime next week\u2014given the rate addresses are currently being gobbled up.The Internet Assigned Numbers Authority (IANA) will then have doled out all its so-called \"slash-eight\" blocks of addresses to the five regional internet registries around the world. In turn, the registries are expected to have allocated all their remaining addresses to local network operators by October at the latest. After that, any organisation applying for new addresses will be told, sorry, none left.The issue is real and has been a long time in the making. The Economist first warned about it ten years ago (see \" Upgrading the internet \", March 22nd 2001). The problem concerns the address space of the existing version of the internet protocol (IPv4), which is only 32 bits wide. The total number of binary addresses possible with such an arrangement is therefore two raised to the power 32\u2014or roughly 4.3 billion in decimal terms. Back in the 1980s, when the internet connected just a couple of dozen research institutes in America, that seemed like a huge number. Besides, the internet was thought at the time to be just a temporary network anyway.But with the invention of the web in 1990 came an explosion in popular demand. It was soon clear that it was only a matter of time before the internet would exhaust its supply of addresses. Work on a replacement for IPv4 began in the early 1990s, with IPv6 finally being made available around 1998 (IPv5 was an experimental protocol for streaming audio and video that has since ceased to exist). By giving the new internet version an address space of 128 bits, the designers pretty well guaranteed that it would not run out of unique identifiers for decades, or even centuries, to come.Two"} +{"output_text": ", etc.\n\nSo, what\u2019s the point of all this?\n\nWell, the point is that you can now write your code in a way that is much more generic and reusable. You can now write your code in a way that is much more generic and reusable. You can now write your code in a way that is much more generic and reusable. You can now write your code in a way that is much more generic and reusable. You can now write", "input_text": ":get_value(Key, Db)} end, loop(Db). fetch(Key, Db) -> Db! {fetch, self(), Key}, receive {reply, Result} -> Result end.\n\nAs you can see, to implement that example you have to learn to spawn processes (using erlang:spawn/3 or other variants) and then send and receive messages. Then again, our database is just a process that receives messages and for each message it receives, using its internal state, builds a response and sends it back to the caller as a message. Yeah, it has a tiny bit of internal logic ( proplists:get_value(Key, Db) ) that\u2019s unique to it. But the rest is pretty generic. Even from the caller side, everything boils down to just sending messages and waiting for responses.\n\nAs you might have guessed by now, lots of people faced this same issue before and the condensed experiences helped create no other thing than OTP. When you learn how to work with OTP and its behaviours, you no longer need to spawn processes and send messages explicitly:\n\nnew(DbValues) -> gen_server:start(?MODULE, DbValues, []). init(DbValues) -> {ok, DbValues}. handle_call({fetch, Key}, _From, Db) -> {reply, proplists:get_value(Key, Db), Db}. fetch(Key, Db) -> gen_server:call(Db, {fetch, Key}). terminate(_, _) -> ok.\n\nDid you see that? You\u2019re suddenly not using spawn or! anymore. gen_server will take care of all that for you. Furthermore, gen_server will also take care of tons of other things you haven\u2019t even thought about yet, like servers not running or running in other nodes"} +{"output_text": "ups and to meet with the SEC\u2019s top brass. He\u2019s been here ever since, and he\u2019s been busy. He\u2019s been meeting with venture capitalists, angel investors, and entrepreneurs, and he\u2019s been talking to the press. He\u2019s been talking to the SEC. He\u2019s been talking to the SEC about blockchain.\n\n\u201cI\u2019m not a blockchain expert,\u201d he said. \u201cI\u2019m a blockchain evangelist.\u201d\n\nZen", "input_text": " on identifying and challenging disadvantaged students early on would contribute to leveling the playing field, fulfilling their talent and increasing their well-being. Testing all students, rather than relying on the traditional parent/teacher nomination system, will actually serve as a tool for greater fairness in placing low-income and minority students in the gifted programming they need. A small early investment in these talented students would pay off in intellectual and technological innovations, as well as GDP, benefitting us all.\n\nAs Thomas Jefferson wrote in Notes on the State of Virginia: Zenel Batagelj is telling us about Slovenia\u2019s blockchain revolution. We\u2019re sitting outdoors in a caf\u00e9 along the river that snakes through Ljubljana, taking in the last sun on a wintry day in the vibrant, picturesque capital of this tiny yet proud Central European nation. Zenel is passionate and big, and the meaty coat he wears to ward off the cold makes him even more imposing and authoritative. He\u2019s telling us about the bold technological advance that helped catapult his country into the future last summer. In just under four days, Iconomi raised a million dollars for its budding cryptocurrency trading platform, and instantly was in business, hiring engineers and making big plans. \u201cIt was an experiment,\u201d Zenel said. \u201cIt was so cool.\u201d At the time, it was the largest European initial coin offering: One million dollars in 88 hours. $10 million in five weeks, \u2026 and that was just the start.\n\nZenel is Slovenia\u2019s global connector, an unapologetic evangelist, fighting the local brain drain that deprived the country of talent and brawn during the recent international financial crisis that hit extra hard here. We met him in Silicon Valley two months ago, when he arrived among a small Slovenian delegation at SEC2SV to learn about US scale-"} +{"output_text": " why the new API is a welcome change. The new API is based on the Joda-Time library. The library is a well-known and widely used library for date and time manipulation.\n\nThe new API is a part of the Java SE platform and is available in Java 8.\n\nThe new API is a part of the Java SE platform and is available in Java 8.\n\nThe new API is a part of the Java SE platform and is available in Java 8.\n", "input_text": "asLongStream().mapToDouble( points -> points / totalPoints ).boxed().mapToLong( weigth -> ( long )( weigth * 100 ) ).mapToObj( percentage -> percentage + \"%\" ).collect( Collectors.toList() ); System.out.println( result );\n\nThe console output is just here:\n\n1 [ 19 %, 50 %, 30 %]\n\nAnd lastly, as we mentioned before, the Stream API is not only about Java collections. The typical I/O operations like reading the text file line by line is a very good candidate to benefit from stream processing. Here is a small example to confirm that.\n\n1 2 3 4 final Path path = new File( filename ).toPath(); try ( Stream< String > lines = Files.lines( path, StandardCharsets.UTF_8 ) ) { lines.onClose( () -> System.out.println( \"Done!\" ) ).forEach( System.out::println ); }\n\nThe onClose method called on the stream returns an equivalent stream with an additional close handler. Close handlers are run when the close() method is called on the stream.\n\nStream API together with Lambdas and Method References baked by Interface\u2019s Default and Static Methods is the Java 8 response to the modern paradigms in software development. For more details, please refer to official documentation.\n\n4.3. Date/Time API (JSR 310)\n\nJava 8 makes one more take on date and time management by delivering New Date-Time API (JSR 310). Date and time manipulation is being one of the worst pain points for Java developers. The standard java.util.Date followed by java.util.Calendar hasn\u2019t improved the situation at all (arguably, made it even more confusing).\n\nThat is"} +{"output_text": ", \u201cI\u2019m not racist, but I\u2019m not going to say I\u2019m not racist.\u201d It\u2019s say\u00ading, \u201cI\u2019m not racist, but I\u2019m not going to say I\u2019m not racist.\u201d It\u2019s say\u00ading, \u201cI\u2019m not racist, but I\u2019m not going to say I\u2019m not racist.\u201d It\u2019s say\u00ading, \u201cI\u2019m not racist, but I\u2019m not going to say", "input_text": "-right.\n\nWhat Char\u00adlottesville let the world see is that there is a con\u00adnec\u00adtion between racist ideas and racist action. The rea\u00adson the alt-right came to Char\u00adlottesville is that they were ter\u00adri\u00adfied to lose their Civ\u00adil War par\u00adtic\u00adi\u00adpa\u00adtion tro\u00adphy, their con\u00adfed\u00ader\u00adate mon\u00adu\u00adment to Robert E. Lee \u2014 who fought to main\u00adtain a white-suprema\u00adcist repub\u00adlic. That\u2019s why the alt-right was here. Prin\u00adci\u00adples of white suprema\u00adcy and Black sub\u00adjec\u00adtion still appeal to them.\n\nSarah: How can peo\u00adple across the coun\u00adtry and the world show sol\u00adi\u00addar\u00adi\u00adty right now?\n\nLisa: There are a vari\u00adety of ways peo\u00adple can stand up. Sup\u00adport Black Lives Mat\u00adter \u2014 not just in Char\u00adlottesville, but all around the coun\u00adtry. Get tapped into local orga\u00adni\u00adza\u00adtions. Have uncom\u00adfort\u00adable and dif\u00adfi\u00adcult con\u00adver\u00adsa\u00adtions that can open the door to greater under\u00adstand\u00ading. Be will\u00ading to be uncom\u00adfort\u00adable. Don\u2019t just go along with racism and casu\u00adal white suprema\u00adcy. That just nor\u00admal\u00adizes white supremacy.\n\nThere is a rea\u00adson white suprema\u00adcy is the air we breathe in this coun\u00adtry. White suprema\u00adcy is not just the Nazis and alt-right. It\u2019s also very casu\u00adal and sub\u00adtle. It\u2019s say\u00ading"} +{"output_text": ", the FCC tried to pass rules that would have allowed Internet providers to charge websites for faster access. The rules were struck down by a federal appeals court.\n\nBut the FCC has been trying to get the rules back on the books ever since.\n\nSo, what's the big deal?\n\nThe big deal is that the FCC is trying to regulate the Internet.\n\nThe FCC is a government agency, and the Internet is a private network.\n\nThe FCC is supposed to", "input_text": ", violates that right, they say.\n\nWhat happened this week?\n\nfederal appeals court ruled on Tuesday that the FCC doesn't have the right to enforce those rules. The court said that the government is tasked with overseeing crucial utilities like telephone service and electricity, but that the Internet isn't considered to be one of those utilities under current law.\n\nThe decision harks back to statements and decisions made by the FCC and other government agencies in the early 2000s, when molasses-slow dial-up connections were the norm and Web access wasn't nearly as common or, some would say, necessary as it is today.\n\nThe FCC has suggested it will appeal the ruling.\n\nSo what?\n\nEverybody who accesses the Internet does so through an Internet service provider. And these providers have been pushing for the ability to dole out that access to us on their own terms.\n\nWhat does that mean? For one, companies like Verizon, who sued the FCC over the rules, would be able to pick and choose who gets the best access.\n\nSo, for example, they might start charging big fees for websites to get in the \"fast lane.\" Those fees presumably would be no problem for the Web's monster moneymakers but tougher to take for the little guys.\n\nThen, all of a sudden, you're starting to get two Internets -- a quick, smooth highway for the major players and a slow, bumpy trail for everybody else.\n\nThe providers could also just blatantly play favorites. So imagine AT&T, a major provider, making traffic quicker on the websites of smartphone companies that use its mobile service and slower on the sites of phone makers who don't. We're not saying they'd do that, of course. But, theoretically, they could.\n\nHas this ever actually happened?\n\nIn 2007"} +{"output_text": " of lung cancer.\n\n\"We found that radon levels in the water were a better predictor of lung cancer than radon levels in the air,\" said Serre. \"It's a very strong correlation.\"\n\nSerre said that the correlation between radon levels in groundwater and lung cancer rates is not surprising, given that radon is a known carcinogen. But he said that the correlation between radon levels in groundwater and rates of stomach cancer is surprising.\n\n\"We don", "input_text": " often poor and have little access to medical care, said Sturchio.\n\nBut Amal Ibrahim, a cancer epidemiologist at the National Cancer Institute at Cairo University and director of Egypt's National Cancer Registry, disagrees. While there may not be many hospitals in the western desert, he said, there are more than a dozen cancer care centers along the Nile that offer treatment free of charge.\n\n\"Once there are symptoms of cancer, people seek help,\" said Ibrahim. \"And we give it to them.\"\n\nSome parts of Egypt have elevated cancer rates due to hepatitis C, said Ibrahim, but so far he has seen no evidence for other regional differences in cancer rates. Still, he said, the radiation risk from groundwater deserves more study. He would like to see data on exactly how much aquifer water people are consuming, and whether the radium is making its way into crops, dairy products and bottled water.\n\nVengosh agrees that more research would be helpful, but he sees no need to wait before taking action to protect the public, either from the Nubian aquifer or from the Disi aquifer he studied in Jordan.\n\n\"There are several studies in the U.S. and Canada that show that communities that were drinking these types of water -- much lower [radiation] than what we see in Disi -- had elevated and high prevalence of cancer,\" said Vengosh. \"We don't need to invent the wheel.\"\n\nMarc Serre, an environmental geostatistician at the University of North Carolina at Chapel Hill, conducted one such study in North Carolina, where many people rely on private wells with elevated levels of radon. Like radium, radon releases ionizing radiation as it decays, and Serre and his graduate student Kyle Messier found a correlation between radon levels in groundwater and rates"} +{"output_text": "ks\u201d of the Sanders campaign, such as the \u201cBernie or Bust\u201d movement, which encouraged supporters to vote for Sanders in the Democratic primary but not to vote for Hillary Clinton in the general election.\n\nThe Sanders campaign was a failure, but it was not a failure of horizontalism. The failure of the Sanders campaign was a failure of the Democratic Party. The Democratic Party is a party of the ruling class, and it is a party that has been in the hands of the ruling", "input_text": "zi Party\u201d 2013).\n\nSimilar developments would have unfolded during the Occupy movement in the United States if it weren\u2019t for the narrowness of the two party system. Yet, several years later, many former Occupiers campaigned for Bernie Sanders in his failed bid for the Democratic Party\u2019s presidential nomination. Certainly many who participated in Occupy before supporting Sanders were simply leftists who travel from one manifestation of left populism to the next without any allegiance to (or often direct knowledge of) horizontalism. Others, however, attempted to argue that the Sanders campaign was an extension of Occupy. This was manifest in an article titled \u201cOccupy the Party\u201d from the Not An Alternative collective that appealed to former Occupiers to treat the campaign \u201clike any street or park and occupy it\u201d (Not An Alternative 2016). In the name of pragmatic populism, this article sought to drain the term \u201coccupy\u201d of its associations with direct action, direct democracy, \u201cleaderlessness,\u201d and revolutionary politics to convince readers that it can be used as a catchy shorthand for buying into the cult of personality developing around a moderate social democrat attempting to burrow into a strati- \ufb01ed, capitalist political party. From an anarchist perspective, parks and streets are terrain of struggle that can be occupied because non-hierarchical, direct action politics can be transplanted onto them. Working within political parties, especially those like the Democratic Party, requires jettisoning those practices and incorporating oneself into the party structure. As the Irish Workers Solidarity Movement organizer Andrew Flood (2014) argued in his essay \u201cAn anarchist critique of horizontalism,\u201d \u201chorizontalism without a vision and method for revolution simply provides protest fodder behind which one government can be replaced with another.\u201d Indeed, many anti-horizontal organizers, have been perfectly willing to humor the directly democratic \u201cquir"} +{"output_text": " the centerpiece of the debate, the right-wingers are able to claim that they are \u201cdefending\u201d the Muslim community against these two \u201cextremists.\u201d\n\nThe debate is always a disaster. Choudary and Bakri are not debaters. They are not even debating. They are not debating because they are not debating. They are not debating because they are not debating. They are not debating because they are not debating. They are not", "input_text": " the free Internet.\"\n\nExchanges between LaJeunesse and Edelman also discussed how to talk about WCIT publicly. \"I'm working with DC team for a strategy of how to characterize WCIT (victory, loss or a less manichean spin) and what that, in turn, means moving forward,\" LaJeunesse wrote. The emails show LaJeunesse and Edelman set up a private meeting, as Edelman wrote that a White House delegate gathering could resolve questions over \"how to discuss the issue.\"\n\nUltimately, whatever discussions were had about how to discuss the vote, Google continued ringing alarm bells. In a statement later sent to reporters, a Google spokesperson said the vote showed that \"governments want to increase regulation and censorship of the Internet.\" Sheila Musaji of The American Muslim (TAM) has been keeping a close eye on the loons who write for Jihad Watch. The chief loon of JW, Robert Spencer, had initially been slated to debate David Wood, another Christian loon like himself. Realizing no doubt that they are on the same side of the loony equation, the debate has been scrapped. Instead, both Spencer and Wood have agreed to face off against Anjem Choudary and Omar Bakri.\n\nAs Musaji presciently noted, \u201c[b]oth Choudary and Bakri are part of the Muslim lunatic fringe.\u201d The nefarious duo are very familiar to the Muslim community of the U.K., not because they have a large following (they don\u2019t), but because they are routinely trotted out by anti-Muslim right-wingers. The set-up is always the same: a right-winger pundit will invite one of these two clowns onto their show for a \u201cdebate.\u201d By making the hated Choudary and Bakri"} +{"output_text": " \u044f \u0437\u043d\u0430\u044e.\n\n\u041b\u0435\u043a\u0441\u0443\u0441 \u0438 \u0412\u043e\u0432\u0430\u043d: \u042d\u0442\u043e \u043a\u043b\u0430\u0441\u0441\u0438\u0447\u0435\u0441\u043a\u0430\u044f \u043f\u0435\u0441\u043d\u044f, \u043a\u043e\u0442\u043e\u0440\u0443\u044e \u0432\u044b \u043c\u043e\u0436\u0435\u0442\u0435 \u0443\u0441\u043b\u044b\u0448\u0430\u0442\u044c \u0432 \u043b\u044e\u0431\u043e\u0439 \u043c\u0443\u0437\u044b\u043a\u0430\u043b\u044c\u043d\u043e\u0439 \u043a\u043b\u0443\u0431\u0435. \u042d\u0442\u043e \u043f\u0435\u0441\u043d\u044f, \u043a\u043e\u0442\u043e\u0440\u0443\u044e \u0432\u044b \u043c\u043e\u0436\u0435\u0442\u0435 \u0443\u0441\u043b\u044b\u0448\u0430\u0442\u044c \u0432 \u043b\u044e\u0431\u043e\u0439 \u043c\u0443\u0437\u044b\u043a\u0430\u043b\u044c\u043d\u043e\u0439 \u043a\u043b\u0443\u0431\u0435. \u042d\u0442\u043e \u043f\u0435\u0441\u043d\u044f,", "input_text": "\u0435\u043c \u043e\u0442\u043c\u0435\u0447\u0430\u044e \u0443\u043a\u0440\u0435\u043f\u043b\u0435\u043d\u0438\u0435 \u043e\u0442\u043d\u043e\u0448\u0435\u043d\u0438\u0439 \u043c\u0435\u0436\u0434\u0443 \u0423\u043a\u0440\u0430\u0438\u043d\u043e\u0439 \u0438 \u0421\u0428\u0410, \u0447\u0435\u043c\u0443 \u0441\u043f\u043e\u0441\u043e\u0431\u0441\u0442\u0432\u0443\u0435\u0442 \u043f\u0440\u0435\u0437\u0438\u0434\u0435\u043d\u0442 \u0422\u0440\u0430\u043c\u043f. \u041d\u0430\u0434\u0435\u044e\u0441\u044c, \u0447\u0442\u043e \u044d\u0442\u0438 \u043e\u0442\u043d\u043e\u0448\u0435\u043d\u0438\u044f \u043f\u0440\u043e\u0434\u043e\u043b\u0436\u0430\u0442 \u0440\u0430\u0437\u0432\u0438\u0432\u0430\u0442\u044c\u0441\u044f. \u0418 \u044f \u043e\u043f\u0440\u0435\u0434\u0435\u043b\u0435\u043d\u043d\u043e \u043f\u0440\u043e\u0434\u043e\u043b\u0436\u0443 \u043e\u0441\u0432\u0435\u0449\u0430\u0442\u044c \u0434\u0430\u043d\u043d\u044b\u0439 \u0432\u043e\u043f\u0440\u043e\u0441. \u041d\u0430\u0434\u0435\u044e\u0441\u044c, \u0447\u0442\u043e, \u043a\u043e\u0433\u0434\u0430 \u0432\u044b \u0432 \u0441\u043b\u0435\u0434\u0443\u044e\u0449\u0438\u0439 \u0440\u0430\u0437 \u0431\u0443\u0434\u0435\u0442\u0435 \u0432 \u0412\u0430\u0448\u0438\u043d\u0433\u0442\u043e\u043d\u0435, \u043c\u044b \u0441\u043c\u043e\u0436\u0435\u043c \u0432\u044b\u043f\u0438\u0442\u044c \u043f\u043e \u043f\u0438\u0432\u0443.\n\n\u041b\u0435\u043a\u0441\u0443\u0441 \u0438 \u0412\u043e\u0432\u0430\u043d: \u0414\u0430, \u043a\u043e\u043d\u0435\u0447\u043d\u043e. \u0414\u0443\u043c\u0430\u044e, \u044f \u0431\u0443\u0434\u0443 \u0432 \u0412\u0430\u0448\u0438\u043d\u0433\u0442\u043e\u043d\u0435 \u0447\u0435\u0440\u0435\u0437 \u0434\u0432\u0430 \u043c\u0435\u0441\u044f\u0446\u0430.\n\n\u0420\u043e\u0433\u0430\u043d: \u0425\u043e\u0440\u043e\u0448\u043e.\n\n\u041b\u0435\u043a\u0441\u0443\u0441 \u0438 \u0412\u043e\u0432\u0430\u043d: \u042f \u0441\u043e\u043e\u0431\u0449\u0443 \u0432\u0430\u043c, \u043a\u043e\u0433\u0434\u0430 \u043f\u0440\u0438\u0435\u0434\u0443, \u0438 \u043c\u044b \u0441\u043c\u043e\u0436\u0435\u043c \u043f\u043e\u043e\u0431\u0449\u0430\u0442\u044c\u0441\u044f \u043b\u0438\u0447\u043d\u043e. \u042f \u043f\u043e\u043d\u0438\u043c\u0430\u044e \u0438 \u043d\u0430\u0434\u0435\u044e\u0441\u044c, \u0447\u0442\u043e \u0432\u044b \u0441\u043e\u0433\u043b\u0430\u0441\u043d\u044b \u0441 \u0442\u0435\u043c, \u0447\u0442\u043e \u043c\u044b \u0434\u043e\u043b\u0436\u043d\u044b \u0441\u043e\u0437\u0434\u0430\u0442\u044c \u043d\u043e\u0432\u0443\u044e \u043d\u0430\u0446\u0438\u044e, \u0432 \u043a\u043e\u0442\u043e\u0440\u043e\u0439 \u043d\u0435 \u0431\u0443\u0434\u0435\u0442 \u043c\u0435\u0441\u0442\u0430 \u0440\u043e\u0441\u0441\u0438\u0439\u0441\u043a\u043e\u0439 \u0430\u0433\u0440\u0435\u0441\u0441\u0438\u0438, \u0443\u0436\u0430\u0441\u043d\u043e\u0439 \u0430\u0433\u0440\u0435\u0441\u0441\u0438\u0438, \u0432 \u043e\u0441\u043e\u0431\u0435\u043d\u043d\u043e\u0441\u0442\u0438 \u043d\u0435 \u0431\u0443\u0434\u0435\u0442 \u043c\u0435\u0441\u0442\u0430 \u0434\u043b\u044f \u0438\u0445 \u043a\u0443\u043b\u044c\u0442\u0443\u0440\u044b, \u043f\u0435\u0441\u0435\u043d, \u0432 \u0447\u0430\u0441\u0442\u043d\u043e\u0441\u0442\u0438 \u0433\u0435\u043e\u0440\u0433\u0438\u0435\u0432\u0441\u043a\u0438\u0445 \u043b\u0435\u043d\u0442. \u0412\u044b \u0436\u0435 \u0437\u043d\u0430\u0435\u0442\u0435, \u0447\u0442\u043e \u0442\u0430\u043a\u043e\u0435 \u0433\u0435\u043e\u0440\u0433\u0438\u0435\u0432\u0441\u043a\u0430\u044f \u043b\u0435\u043d\u0442\u0430?\n\n\u0420\u043e\u0433\u0430\u043d: \u0414\u0430,"} +{"output_text": " 059-9173381,\n\nbodyandsoulfestival.com\n\n\n\nThe Big Weekend 27-29\n\nThe Big Weekend is a festival of music, comedy, literature, film, art, literature, comedy, music, art, literature, comedy, music, art, literature, comedy, music, art, literature, comedy, music, art, literature, comedy, music, art, literature, comedy, music, art, literature, comedy, music", "input_text": " Coronas, Neil Young and Crazy Horse, Tom Jones and Robert Plant are already confirmed for this Cork rockfest, with more names being added \u2013 it seems \u2013 by the day. 01-7755800\n\n\n\nImmrama 12-15\n\nTravel to the heritage town of Lismore in Co Waterford, and you can travel the world. Past visitors to the festival of travel writing have included Paul Theroux, Colin Thubron, Jan Morris, Pico Iyer, Michael Palin and Redmond O\u2019Hanlon. 058-53803, lismoreimmrama.com\n\n\n\nBorris House Festival of Writing and Ideas 14-15\n\nBorris House sits at the foot of the Blackstairs Mountains in Co Carlow. Add in an eclectic mix of speakers, from the passionate to the provocative, and you have a recipe for festival success. Already signed up for 2014 are Roy Foster, Patrick McGrath, John Banville, Polly Samson and Stephen Frears. 059-9173381,\n\nfestivalofwritingandideas.com\n\n\n\nBloomsday 16\n\nDust off your straw boater and your striped jacket for the annual outbreak of Edwardian costumes, readings, performances, re-enactments and kidney breakfasts as Dublin celebrates its favourite fictional son, the antihero of Joyce\u2019s Ulysses, Leopold Bloom. 01-8788547, jamesjoyce.ie\n\n\n\nBody and Soul 20-22\n\nCelebrate the summer solstice in style at the perfectly formed boutique festival at Ballinlough Castle, Co Westmeath. Last year\u2019s programme included electronic superstars Jon Hopkins and Nicolas Jaar as well as Nick Cave and the Bad Seeds. Families are welcome; there\u2019s even a special kids\u2019 area."} +{"output_text": ",\u201d he says. \u201cI\u2019ve been to every state. I\u2019ve been to the mountains, the deserts, the prairies. I\u2019ve been to the cities, the small towns, the big cities. I\u2019ve been to the mountains, the deserts, the prairies, the cities, the small towns, the big cities. I\u2019ve been to the mountains, the deserts, the prairies, the cities, the small towns, the big cities. I", "input_text": "early every bottle is bourbon, though there is a smattering of rye and sour mash. I can see bottles from the 1800s next to obscure export bottlings of Jim Beam next to standard-issue Jack Daniel\u2019s. Seiichiro Tatsumi, an older man dressed elegantly in bartender\u2019s attire, emerges from the shadows and says hello in English. I tell him I am a friend of Nick\u2019s, and he reaches for a bottle nestled behind the register. \u201cYou want to try a 1904?\u201d he asks.\n\nHe tenderly unscrews the top and pours a shot for me and another for himself. I take a sip. It is a brand I\u2019ve never heard of, once made, Tatsumi says, especially for a hotel in Kentucky. It is highly alcoholic but silky smooth. Unlike wine or vintage port, bourbon is not supposed to change much in the bottle over time. And so I think of this as a chance to taste the past and experience, almost exactly, what drinkers were sipping a hundred years ago.\n\n\u201cI tasted my first bourbon in the basement bar of the Rihga Royal Hotel, a famous old place in Osaka,\u201d Tatsumi says. \u201cThen I spent years reading everything I could about bourbon at the American cultural center. I sent letters to Kentucky and Tennessee trying to set up visits to the distilleries. I even asked for help at the American consulate. And then I finally got to visit in 1984. I fell in love with America then. I\u2019ve been back a hundred times since. I now own a house in Lexington, and I\u2019ve even been named a colonel in Kentucky.\u201d\n\nI ask him how he found all these old bottles of bourbon. \u201cI drive across America, only on the back roads"} +{"output_text": " Polish pilots flying with the RAF. The unit was also involved in the Battle of Britain, but the main task was to escort bombers to their targets. The unit was also involved in the Battle of France, but the main task was to escort bombers to their targets.\n\nThe unit was also involved in the Battle of France, but the main task was to escort bombers to their targets. The unit was also involved in the Battle of Britain, but the main task was to escort bombers to their targets", "input_text": " its last stages, and although the favorable weather for invasion could still come, its threat was vanquished. The Luftwaffe was still a handful, but the Fighter Command could breath easily now.\n\nOn 15 December 1940, in recognition of their merits, Air Marshall Sholto-Douglas decorated Witold Urbanowicz, Zdzislaw Henneberg, Jan Zumbach and Miroslaw Feric with DFCs. The fifth one was posthumously given to Ludwik Paszkiewicz. Few days later S/Ldr Kellet, F/Lt Kent and F/Lt Forbes (both were given their own commands) left the unit, and No. 303 Squadron became purely Polish. By early 1941, the defense of the island remained the RAF Fighter Command prime task, but now it could venture into other jobs. The main German fighter Bf109 was still superior to Hurricane and Spitfire in climb rate and altitude performance, what caused the RAF planners to conceive a new fighter tactic called \"Mosquito\" (later called Rhubarb). Their objective was to harass and irritate the enemy by surprised, hedgehopping attacks against a variety of its targets on the ground. This suited very well Polish spirit and mentality to seek and engage the enemy.\n\nThe 303 flew the first of such a sorties on January 22. Six Polish Hurricanes led by F/Lt Henneberg attacked two German airfields at the estuary of the Somme River causing considerable damage. F/Lt Lapkowski brought a souvenir from that mission; 25 feet of telephone wire somehow wound round the engine of his Hurricane.\n\nToward the end of January 1941, the squadron converted itself to Spitfires: first the Mk Is then Mk II. In February, the unit joined \"Circus\" operations, with the"} +{"output_text": "\u2019s relative to what you think is mainstream. I think that the mainstream is not mainstream at all. It\u2019s a very narrow view of the world. It\u2019s a very narrow view of the economy. It\u2019s a very narrow view of the world. It\u2019s a very narrow view of the economy. It\u2019s a very narrow view of the world. It\u2019s a very narrow view of the economy. It\u2019s a very narrow view of the world. It\u2019", "input_text": " represent a particular worldview. They are usually people from the financial community and, without meaning to be biased, tend to look at the world through a specific lens. There needs to be some kind of counter to that. If you are uncomfortable with having people without any economics background then at least ask the trade unions or citizen groups or something to recommend an economist who can sit in there who can represent other interests. At the moment it is full of people either from the financial industry or purely from academic backgrounds, so monetary policy gets decided in a very particular way.\n\nYou\u2019re also critical of how mainstream economic thought dominates academia\u2026\n\nI sometimes liken the economics profession to the Catholic clergy in the Middle Ages. Unless you knew Latin you couldn\u2019t even read the Bible because the Pope refused to let the Bible be translated into the local languages. You had to either learn Latin or take their word for it. Economics has become completely inaccessible to many people so we need to change this in the way that some of the religious reforms back then tried to do. In those days, religious reformers promoted the translation of the Bible into local languages and the reading of Bible by common people. They emphasised the authority of the Bible rather than what the Vatican says is in the Bible. They, if you like, democratised the religion. Something similar to that is necessary now once again. This is not to say that we don\u2019t need academic economists; those people are necessary. But it has to be related to what\u2019s going on in the real world. Unfortunately a lot of my academic colleagues not only do not work on the real world, but are not even interested in the real world.\n\nWhat kind of heterodox or alternative streams of economic thought would you like to see in the mainstream?\n\nI don\u2019t like the term heterodox because it\u2019s relative. It"} +{"output_text": " upon to explain the behavior of economic actors only when the actors are motivated by the same values and goals. In other words, the laws of economics are not a set of universal laws that can be applied to all economic actors, but rather a set of laws that can be applied to the behavior of economic actors only when the actors are motivated by the same values and goals.\n\nRothbard's approach to monetary history does not focus on measurement but on motives. He does not attempt to measure the", "input_text": " the lobby\u2019s tutelage could do worse than to ensure constant sunlight. Given the nature of the scandal, the Department of Justice is susceptible to public pressure and thanks to Smith\u2019s indispensable work, we are privy to the means employed by the lobby to thwart similar investigations in the past. Armed with this information, it may finally be possible to uphold what Smith calls \u201cAmerica\u2019s defense line\u201d \u2014 the US laws that govern the operations of foreign agents.\n\nMuhammad Idrees Ahmad is the co-founder of Pulsemedia.org. He can be reached at m.idrees A T gmail D O T com.\n\nRelated Links Tags\n\n<\n\n\"Rothbard's approach to monetary history does not focus on measurement but on motives.\"\n\nIn this volume, Murray Rothbard has given us a comprehensive history of money and banking in the United States, from colonial times to World War II, the first to explicitly use the interpretive framework of Austrian monetary theory. But even aside from the explicitly Austrian theoretical framework undergirding the historical narrative, this book does not \"look\" or \"feel\" like standard economic histories as they have been written during the past quarter of a century, under the influence of the positivistic \"new economic history\" or \"cliometrics.\"\n\nThe focus of this latter approach to economic history, which today completely dominates this field of inquiry, is on the application of high-powered statistical methods to the analysis of quantitative economic data. What profoundly distinguishes Rothbard's approach from the prevailing approach is his insistence upon treating economic quantities and processes as unique and complex historical events. Thus, he employs the laws of economic theory in conjunction with other relevant disciplines to trace each event back to the nonquantifiable values and goals of the particular actors involved.\n\nIn Rothbard's view, economic laws can be relied"} +{"output_text": " Vieira (19).\n\nArsenal, in their first ever FA Cup final, were beaten by a goal from a player who had been on the losing side in the previous year\u2019s final, Newcastle United\u2019s French striker Nicolas Anelka.\n\nThe Gunners had taken the lead in front of a low crowd of 79,183 after a goal from Patrick Vieira, but Newcastle equalised with a goal from French striker Nicolas Anelka.\n\nThe G", "input_text": "piece put Arsenal ahead whilst David Hirst converted a Mark Bright header from close range.\n\nBefore the game, Steve Morrow collected his League Cup winner\u2019s medal, having broken his arm at the end of that final following some high-jinks from Tony Adams.\n\n20th May 1993\n\nArsenal 2 Sheffield Wednesday 1\n\nWembley Stadium\n\nAttendance: 62,267\n\nDavid Seaman, Lee Dixon, Andy Linighan, Tony Adams (captain), Nigel Winterburn, Paul Merson, Paul Davis, John Jensen, Kevin Campbell, Alan Smith, Ian Wright (David O\u2019Leary).\n\nGoalscorers: Wright (34), Linighan (119).\n\nArsenal, in the last ever FA Cup final replay, became the first team to win the domestic cup double with a very late headed winner from Andy Linighan at rain swept Wembley. The Gunners had taken the lead in front of a low crowd of 62,267 after Ian Wright ran onto a Alan Smith flick and shot past Chris Woods, only for Wednesday to equalise with a deflected Chris Waddle shot past David Seaman.\n\nDavid O\u2019Leary won his second FA Cup winner\u2019s medal fourteen years after his first, the longest gap for a player.\n\n1998 \u2013 The \u201cEasy\u201d Cup Final\n\n16th May 1998\n\nArsenal 2 Newcastle United 0\n\nWembley Stadium\n\nAttendance: 79,183\n\nDavid Seaman, Lee Dixon, Tony Adams (captain), Martin Keown, Nigel Winterburn, Ray Parlour, Patrick Vieira, Emmanuel Petit, Marc Overmars, Christopher Wreh (David Platt), Nicolas Anelka.\n\nGoalscorers: Overmars (23),"} +{"output_text": " CPS data. It focuses on an overall age group as opposed to individuals in the U.S. school system, so it can be used to study general population issues.\n\nreports the percentage of individuals in a given age range who are not in school and have not earned a high school diploma or an alternative credential, irrespective of when the credential was earned. The rate is calculated using CPS data. It focuses on an overall age group as opposed to individuals in the U.", "input_text": " students attending both public and private schools using the Current Population Survey (CPS), and state event dropout rates for public high school students using the Common Core of Data (CCD). 6 Event dropout rates can be used to track annual changes in the dropout behavior of students in the U.S. school system.\n\nestimates the percentage of high school students who left high school between the beginning of one school year and the beginning of the next without earning a high school diploma or an alternative credential (e.g., a GED). This report presents a national event dropout rate for students attending both public and private schools using the Current Population Survey (CPS), and state event dropout rates for public high school students using the Common Core of Data (CCD). Event dropout rates can be used to track annual changes in the dropout behavior of students in the U.S. school system. The status dropout rate reports the percentage of individuals in a given age range who are not in school and have not earned a high school diploma or an alternative credential. The rate is calculated using CPS data. It focuses on an overall age group as opposed to individuals in the U.S. school system, so it can be used to study general population issues.\n\nreports the percentage of individuals in a given age range who are not in school and have not earned a high school diploma or an alternative credential. The rate is calculated using CPS data. It focuses on an overall age group as opposed to individuals in the U.S. school system, so it can be used to study general population issues. The status completion rate indicates the percentage of individuals in a given age range who are not in high school and who have earned a high school diploma or an alternative credential, irrespective of when the credential was earned. 7 The rate is calculated using"} +{"output_text": " process to install them through the Visual Studio installer.\n\nThe Anaconda distribution is a popular Python distribution that includes a wide range of packages, including the Python 3.5 and 3.6 versions.\n\nThe Python 3.5 and 3.6 versions are available as separate installs, so you can install them side-by-side with other versions of Python.\n\nThe Python 3.5 and 3.6 versions are available as separate installs, so you can install", "input_text": "-by-side with another install of Visual Studio. And while the Preview install is not officially supported (not \u201cgo live\u201d), the main one remains fully supported and you can use both at the same time on the same machine.\n\n(To be clear, if you\u2019ve already installed Visual Studio 2017, this will install it again and you\u2019ll have two separate installations with separate workloads and settings, both managed through the same Visual Studio Installer. Everything that is in Visual Studio 2017 is also in Preview, so you would only need both to take advantage of paid product support offerings, which are not available for Preview. The FAQ at the product page has more details.)\n\nWhy announce it here?\n\nYou may be wondering why we are announcing Visual Studio Preview from the Python blog, rather than the main Visual Studio blog. The reason is that the first available feature in preview is the Python development workload!\n\nThis is the same Python support you\u2019ve used since Visual Studio 2010 as Python Tools for Visual Studio, but now updated and enhanced for 2017. Let\u2019s walk through some of the major changes, and we\u2019ll be posting more blogs in the coming weeks diving deeper into each one.\n\nInstallation\n\nAs Python support is now part of the Visual Studio installer, we get to take advantage of the features it provides. In the screenshot below, you can see that when the Python development workload is selected, a list of optional components become available. These are either recommended Visual Studio features or third-party tools that we think you will find useful.\n\nFirstly, you\u2019ll see a range of Python versions available, as well as the Anaconda distribution from Continuum Analytics. These are coming straight from the external sites, so you can install them yourself and get exactly the same functionality, but we\u2019re also making it a simple"} +{"output_text": " the Philippines, and today, it is eaten in every corner of the archipelago. It is the national dish of the Philippines, and the country\u2019s most popular noodle dish. It is also the most popular noodle dish in the world.\n\n\n\nPancit Mami at Masuki Mami House (Benavidez Street, Binondo)\n\nPhoto by: Ramon F Velasquez\n\nPancit Mami is a dish that is eaten throughout the", "input_text": " or enhanced with soy sauce, introduced by Chinese traders. Standard adobo is made with chicken or pork, but anything can be adobo\u2019d\u2014squid, beef, quail, shrimp, kangkong (water spinach), catfish, tanigue (mackarel), frog legs, crickets, banana flowers, bamboo shoots. Wherever you\u2019re from, and whatever your taste, the best adobo is almost always your mother\u2019s.\n\n\n\nAmy Besa, owner of Purple Yam in Brooklyn and Manila, also a self-described Manile\u00f1a, makes her adobo with apple cider vinegar because that\u2019s what was available in her city supermarkets growing up. A recipe from her book Memories of Philippine Kitchens uses baby back ribs and replaces the peppercorns with tellicherry peppers. She shared her recipe with the New York Times, where it maintains a perfect five-star rating from hundreds of homesick Filipinos.\n\n\n\nPancit Mami at Masuki Mami House (Benavidez Street, Binondo)\n\nPhoto by: Ramon F Velasquez\n\nWhen Chinese traders first arrived on the islands in the 9th century, they brought with them an array of noodle dishes from the homeland that have, over the years, come to be known collectively as pancit. The word comes from the phrase pian i sit, meaning \u201cconvenient food\u201d in the Hokkien dialect spoken in Fujian. Hawkers eventually set up panciterias, Manila\u2019s first restaurants, with Binondo as their epicenter. It is the world\u2019s oldest Chinatown, established in 1594 when the Spanish Governor granted land to immigrant Chinese merchants who had converted to Catholicism. For 400 years, Binondo was the economic center of Manila.\n\n\n\nOver the centuries, pancit spread throughout"} +{"output_text": "SK-Gel G3000SWxl column (Tosoh, Tokyo, Japan) with a flow rate of 0.5 mL/min at 25\u00b0C. The column was calibrated with a series of standard proteins (Bio-Rad, Hercules, CA). The elution profiles were monitored by absorbance at 280 nm. The column was calibrated with a series of standard proteins (Bio-Rad, Hercules, CA). The elution profiles were monitored by absorbance at 280 nm. The column was calibrated", "input_text": " nm. PBS buffer containing different concentrations of BPA were measured as the controls. All of the experiments were performed at least three times, and the lag times were calculated as we previously described [31]. Transmission electronic microscopy (TEM) The TEM was performed as previously described [32]. Briefly, 5 \u00b5l of sample was applied onto a 300-mesh Formvar-carbon coated copper grid. Excess solvent was removed carefully and stained by dropwise addition of 1% freshly prepared uranyl formate followed by air drying. Images were observed under a transmission microscope (Hitachi, Tokyo, Japan) operating at an accelerating voltage of 100 kV. Dye leakage assays POPG was dissolved in chloroform at a concentration of 10 mg/mL. Chloroform was then removed under a stream of N 2, and samples were dried under vacuum to remove residual chloroform. Multilamellar vesicles were made by mixing dry POPG films with 25 mM PBS (pH 7.4) containing 40 mM carboxyfluorescein. PD-10 columns (Sangon, Shanghai, China) were then used to remove nonencapsulated carboxyfluorescein as previously described [33]. POPG vesicles containing carboxyfluorescein were diluted in 25 mM PBS (pH 7.4) for florescence measurements. hIAPP stock solution was added to POPG vesicles at a final concentration of 1 \u00b5M immediately before measurement. The samples were excited at a wavelength of 493 nm, and the emission was detected at 518 nm. The fluorescence signal was recorded for 90 s, POPG vesicles alone were tested as the baseline and the signals of POPG vesicles treated with 0.2% (v/v) Triton X-100 (for complete membrane leakage) were used as the positive control. All measurements were repeated at least three times. Size-exclusion chromatography (SEC) The SEC analysis was performed on a T"} +{"output_text": "st of January and the 31st of March.\n\nAccomodation:\n\nAccommodation at Tukino Ski Field is $44.50 per night for adult members.\n\nAnnual membership fee:\n\n$70 for an adult\n\n$50 for a student\n\n$35 for a child\n\nAccommodation:\n\nAccommodation at Tukino Ski Field is $44.50 per night for adult members.\n\nAnn", "input_text": ".\n\nPrice of Ski Passes at Tukino Ski Field:\n\nWe scored a half price deal on Grab One, so make sure you check that out first. Otherwise, the day passes are:\n\n$65 for an adult\n\n$50 for a student\n\n$35 for a child.\n\nTIP: If you are planning to head up to the slopes for several days or stay in a lodge up the mountain, you will be much better off to become a member of one of the three Tukino clubs. This way you can enjoy the discounted prices they offer to members.\n\nTukino ski field clubs:\n\nAnnual membership fee:\n\n$70 for an adult\n\n$50 for a student\n\n$100 for 3+ members of the same family\n\nAccommodation:\n\nAccommodation for adult members is $44.50 per night during Winter. Children under 5 stay for free. This price includes meals as well. They also have options for Summer lodging.\n\nAnnual membership fee:\n\n$80 for an adult\n\n$50 for a child\n\n$200 for a family of any number of siblings\n\nNote: Apart from Tukino, Aorangi ski club also covers Whakapapa Ski Field and Turoa Ski Field. The prices listed above are only for membership to their Tukino club.\n\nAccomodation:\n\nLodging at Aorangi Ski Club is $37 per night for adult members.\n\nAnnual membership fee:\n\n$75 for an adult\n\n$40 for a child (under 18)\n\n$200 for a family\n\n$50 for a student or pensioner\n\nNote: you have to download and fill out a form and they only accept new members between the 1"} +{"output_text": "\u043e\u0437\u0434\u043d\u0435\u0435 2012 \u0433\u043e\u0434\u0430. \u0412 \u0430\u0432\u0433\u0443\u0441\u0442\u0435 2012 \u0433\u043e\u0434\u0430 \u043e\u043d \u0432\u0441\u0442\u0440\u0435\u0442\u0438\u043b\u0441\u044f \u0441 \u0440\u043e\u0441\u0441\u0438\u0439\u0441\u043a\u0438\u043c\u0438 \u0434\u0438\u043f\u043b\u043e\u043c\u0430\u0442\u0430\u043c\u0438 \u0432 \u041c\u043e\u0441\u043a\u0432\u0435 \u0438 \u043f\u043e\u043f\u0440\u043e\u0441\u0438\u043b \u043e \u043f\u043e\u043c\u043e\u0449\u0438 \u0432 \u0432\u043e\u043f\u0440\u043e\u0441\u0435 \u043e\u0440\u0433\u0430\u043d\u0438\u0437\u0430\u0446\u0438\u0438 \u043c\u0435\u0436\u0434\u0443\u043d\u0430\u0440\u043e\u0434\u043d\u043e\u0433\u043e \u043a\u043e\u043d\u0444\u043b\u0438\u043a\u0442\u0430 \u0432 \u0421\u0438\u0440\u0438\u0438.\n\n\u0412 \u0430\u0432\u0433\u0443\u0441\u0442\u0435 2013 \u0433\u043e\u0434\u0430 \u041d\u0430\u0437\u0437", "input_text": "\u0424\u043e\u043d\u0442\u0430\u043d\u043a\u0430\u00bb \u0432\u044b\u044f\u0441\u043d\u0438\u043b\u0430, \u043a\u0430\u043a\u0438\u043c \u043e\u0431\u0440\u0430\u0437\u043e\u043c \u041d\u0430\u0437\u0437\u0430\u0440\u043e \u0441\u0442\u0430\u043b \u043d\u0435\u0434\u043e\u0441\u044f\u0433\u0430\u0435\u043c \u0434\u043b\u044f \u0424\u0411\u0420.\n\n\u0412 \u0438\u044e\u043b\u0435 2012 \u0433\u043e\u0434\u0430 \u043d\u0430 \u041c\u0430\u043d\u0445\u044d\u0442\u0442\u0435\u043d\u0435 \u043f\u043e\u0436\u0435\u043d\u0438\u043b\u0438\u0441\u044c \u0433\u0440\u0430\u0436\u0434\u0430\u043d\u0438\u043d \u0421\u0428\u0410 \u0420\u0438\u043d\u0430\u043b\u0434\u043e \u041d\u0430\u0437\u0437\u0430\u0440\u043e \u0438 \u0433\u0440\u0430\u0436\u0434\u0430\u043d\u043a\u0430 \u0420\u043e\u0441\u0441\u0438\u0438, \u0443\u0440\u043e\u0436\u0435\u043d\u043a\u0430 \u0427\u0435\u0431\u043e\u043a\u0441\u0430\u0440. \u041e\u043d\u0430 \u043f\u0435\u0440\u0435\u0435\u0445\u0430\u043b\u0430 \u0432 \u0410\u043c\u0435\u0440\u0438\u043a\u0443 \u0437\u0430 \u043d\u0435\u0441\u043a\u043e\u043b\u044c\u043a\u043e \u043b\u0435\u0442 \u0434\u043e \u0437\u0430\u043c\u0443\u0436\u0435\u0441\u0442\u0432\u0430, \u0440\u0430\u0431\u043e\u0442\u0430\u043b\u0430 \u0432 \u0431\u0430\u043d\u043a\u0435.\n\n\n\n\u0420\u0438\u043d\u0430\u043b\u0434\u043e \u041d\u0430\u0437\u0437\u0430\u0440\u043e (\u0441\u043b\u0435\u0432\u0430) \u0424\u043e\u0442\u043e: \u0441\u043e\u0446\u0441\u0435\u0442\u0438\n\n\u0423 \u041d\u0430\u0437\u0437\u0430\u0440\u043e \u0431\u044b\u043b \u0432 \u0421\u0428\u0410 \u043e\u0445\u0440\u0430\u043d\u043d\u044b\u0439 \u0431\u0438\u0437\u043d\u0435\u0441. \u0415\u0433\u043e \u043a\u043e\u043c\u043f\u0430\u043d\u0438\u044f Omega Solutions International \u043f\u0440\u0435\u0434\u043b\u0430\u0433\u0430\u043b\u0430 \u0443\u0441\u043b\u0443\u0433\u0438 \u00ab\u043f\u0440\u043e\u0444\u0435\u0441\u0441\u0438\u043e\u043d\u0430\u043b\u043e\u0432 \u0432 \u043e\u0431\u043b\u0430\u0441\u0442\u0438 \u0431\u0435\u0437\u043e\u043f\u0430\u0441\u043d\u043e\u0441\u0442\u0438\u00bb \u0441 \u043d\u0430\u0432\u044b\u043a\u0430\u043c\u0438 \u043f\u0441\u0438\u0445\u043e\u043b\u043e\u0433\u0438\u0447\u0435\u0441\u043a\u0438\u0445 \u043e\u043f\u0435\u0440\u0430\u0446\u0438\u0439, \u0440\u0430\u0437\u0432\u0435\u0434\u043a\u0438, \u0431\u043e\u0440\u044c\u0431\u044b \u0441 \u0442\u0435\u0440\u0440\u043e\u0440\u0438\u0437\u043c\u043e\u043c \u0438 \u0431\u043e\u0440\u044c\u0431\u044b \u0441 \u043f\u043e\u0432\u0441\u0442\u0430\u043d\u0447\u0435\u0441\u043a\u0438\u043c\u0438 \u0444\u043e\u0440\u043c\u0438\u0440\u043e\u0432\u0430\u043d\u0438\u044f\u043c\u0438, \u043f\u0438\u0441\u0430\u043b\u0430 \u0420\u0443\u0441\u0441\u043a\u0430\u044f \u0441\u043b\u0443\u0436\u0431\u0430 \u0411\u0438-\u0411\u0438-\u0421\u0438. \u041f\u043e \u043d\u0435\u043a\u043e\u0442\u043e\u0440\u044b\u043c \u0445\u0430\u0440\u0430\u043a\u0442\u0435\u0440\u0438\u0441\u0442\u0438\u043a\u0430\u043c \u0444\u0438\u0440\u043c\u0443 \u041d\u0430\u0437\u0437\u0430\u0440\u043e \u043c\u043e\u0436\u043d\u043e \u0431\u044b\u043b\u043e \u043f\u0440\u0438\u043d\u044f\u0442\u044c \u0437\u0430 \u0447\u0430\u0441\u0442\u043d\u0443\u044e \u0432\u043e\u0435\u043d\u043d\u0443\u044e \u043a\u043e\u043c\u043f\u0430\u043d\u0438\u044e.\n\n\u0410\u043a\u0442\u0438\u0432\u043d\u043e \u043e\u0441\u0432\u0430\u0438\u0432\u0430\u0442\u044c \u0420\u043e\u0441\u0441\u0438\u044e \u0430\u043c\u0435\u0440\u0438\u043a\u0430\u043d\u0435\u0446 \u043d\u0430\u0447\u0430\u043b \u043d\u0435 \u043f"} +{"output_text": "\n\nYear School 1,000 Point Scorers 2000-01 Notre Dame 23 2001-02 Notre Dame 14 2002-03 Notre Dame 13 2003-04 Notre Dame 13 2004-05 Notre Dame 13 2005-06 Notre Dame 13 2006-07 Notre Dame 13 2007-08 Notre Dame 13 2008-09 Notre Dame 13 2009-10 Notre Dame 13 2010-11 Notre Dame 13 2011-12 Notre Dame 13 2012-13 Notre Dame 13 2013-14 Notre Dame 13 2014-15 Notre Dame 13", "input_text": " page 8 for the top 11 in career games started at Notre Dame. 135 Career games played by Rex Pflueger, who could become the most experienced player, in terms of games played, in Notre Dame history. Pat Connaughton (2011-15) currently holds the school record with 139 games played. Pflueger is tied for eighth in the country for active players in games played. The Dana Point, California, native is on pace to set the school record for games played in his final home game against Virginia Tech on March 7. 843 Sophomore guard Prentiss Hubb and Miami junior guard Chris Lykes were high school teammates at Gonzaga College High School in Washington, D.C. The duo helped lead the Eagles to the 2017 Washington Catholic Athletic Conference title while scoring 843 combined points with 112 assists. 1896 Total victories in the history of Notre Dame men\u2019s basketball, as the Irish close in on becoming the eighth program to reach 1,900 victories (Kentucky, Kansas, North Carolina, Duke, Temple, Syracuse and UCLA have all reached that plateau).\n\nPROGRAM DEVELOPMENT HIGHLIGHTED BY 1,000 POINT SCORERS\n\nNotre Dame has produced 64 1,000 point scorers throughout the history of the program with 23 of those players reaching that statistical milestone during the Mike Brey era (2000-01 \u2013 current). Of the 85 players who have suited up for Notre Dame under Mike Brey, 36 of them were recruited by the staff and exhausted their eligibility with the Irish \u2013 20 of those players have scored more than 1,000 points in a career.\n\nSince 2000-01, Notre Dame leads all ACC teams in 1,000 point scorers (23) and is second in the country over that time frame to Villanova.\n\nSchool All-Time 1,000 Point Scorers"} +{"output_text": " the Erdogan regime.\n\nThe country\u2019s human rights record is a mess. The government has been accused of using the death penalty for political reasons, and has been accused of torture and other human rights violations.\n\nThe country\u2019s human rights record is a mess. The government has been accused of using the death penalty for political reasons, and has been accused of torture and other human rights violations.\n\nThe country\u2019s human rights record is a mess. The government has been", "input_text": " aircraft is the NATO ramifications. If and when Mr. Putin responds, how will the United States and European allies react to Turkey invoking Article 5 of the NATO treaty: \u201ccollective defence means that an attack against one Ally is considered as an attack against all Allies\u201d.\n\nI\u2019m not interested in my country going to war with Russia because the Turks can\u2019t keep it in their pants. And I\u2019m certainly not interested in going to war on their behalf when they have been integral to the funding of terrorists groups like ISIS, and indeed massively useful to Al Qaeda affiliates like Al Nusra Front.\n\nIn fact we shouldn\u2019t be flippant about Turkey\u2019s slide into Islamism. President Erdogan has dragged a previously secular state into an almost all-out war with its Kurdish population, and has presided over a shift in Turkish policy towards Israel, which Mr. Erdogan accused of \u201ccrimes against humanity\u201d in 2008. Unsurprising, perhaps, given the country\u2019s noteworthy aggrandisement of the Muslim Brotherhood, and assistance with the Gaza flotillas, but still worthy of comment, and perhaps indicative of the fact that a regional, rather than ideological bloc like NATO is a little outdated itself. Over the past four U.S. presidents, NATO has approved 16 non-member states as \u201cmajor, non-NATO allies\u201d \u2013 including Israel, Australia, Jordan, Argentina, and Thailand.\n\nTurkey\u2019s cheap assistance to ISIS, literally and figuratively speaking, makes it diametrically opposed to the goals (or should-be goals) of the West towards the terrorist group.\n\nHUMAN RIGHTS\n\nAnd what a crock the country has become in the way of human rights, with Kurds in Turkey, though not without their own faults, under collective oppression and denigration at the hands of"} +{"output_text": "\u2019m not sure if you\u2019ve seen the new season of \u201cSaturday Night Live,\u201d but there\u2019s a new cast member, Kate McKinnon, who\u2019s been doing a lot of great stuff. She\u2019s been doing a lot of great stuff.\n\nI\u2019ve seen it. I\u2019m not a huge fan of hers. I think she\u2019s a great actress, but I don\u2019t think she\u2019s a great comedian. I think she\u2019s a", "input_text": " that you can repeat and repeat and repeat, whereas he comes from improv. I make him learn his lines and and he makes me relax.\n\n\nI remember when I had my first round of auditions, they called me in and said, \u201cBill Hader wants to meet you. How do you feel about coming in and improvising tomorrow?\u201d I was like, \u201cNot good!\u201d I don\u2019t come from that world, and he\u2019s the prince of comedy in this country. But I went in and we improvised for an hour as my callback. From the very beginning, there was a tone of, \u201cLet\u2019s figure this out together.\u201d It was certainly the most fun I\u2019ve ever had in an audition.\n\nThe episode last season where Sally gets dropped by her agent for not sleeping with him was so devastating \u2014 and you filmed it before #MeToo really took off.\n\n[When Hader and Berg] originally wrote that scene, she got defensive and stood up for herself and stormed out. They asked the female writers, \u201cDoes this read true?\u201d And they went, \u201cNo, she would apologize.\u201d I feel like I\u2019ve been in that situation where there\u2019s passive-aggressive sexual harassment and you\u2019re so stunned by it in the moment, your brain doesn\u2019t have time to catch up. I thought it was really smart writing.\n\nI\u2019m fascinated by the backlash [to #MeToo] because I think people are really impatient. I think it\u2019s a movement; it takes time. We\u2019re not done with it and we\u2019re not finished, and I feel as though we\u2019re not going to really understand what\u2019s changed for a good few years. I feel like it\u2019s a long overdue conversation that needed to come out on this kind of massive scale.\n\nI"} +{"output_text": "ny Liston pictured in his prime in 1960\n\nListon's reputation was so bad that he was banned from the ring in the US for a time. He was also banned from the UK for a time after he was arrested for assaulting a police officer in London in 1962.\n\n\"I was in the ring with Sonny Liston and I was scared to death,\" Muhammad Ali would later say. \"I was scared to death of him. I was scared to death of him because", "input_text": " because they use their muscle. He's the last great investment the mob make in boxing.\"\n\nSonny Liston pictured visiting Missouri State Penitentiary - where he once served time - ahead of his first fight with Muhammad Ali in 1964\n\nMobsters were attracted to Liston because of his formidable talent. After all, for a fighter to be a worthwhile investment he needs to able to win. With an eventual record of 50 wins and 4 losses with 39 knockouts, Liston's power was frightening. His left jab was one of the most concussive ever seen in boxing and his stare was one of the most menacing, burning through opponents in the minutes before the opening bell.\n\n\"Of all the men I fought in boxing, Sonny Liston was the scariest,\" Muhammad Ali would later say.\n\n\"Liston does not merely defeat his opponents,\" Jonathan Aig wrote of the fighter. \"He breaks them, shames them, haunts them, leaves them flinching from his punches in their dreams.\"\n\nListon's frightening reputation was something he exploited as he went on an impressive run of wins in the late 1950s and early 1960s. But it was also stoked by a racist press keen to portray him as the kind of brutish black man 'White America' so badly feared.\n\n\"He's arrogant, surly, mean, rude and altogether frightening,\" the famed New York Times columnist Arthur Daley wrote. \"He's the last man anyone would want to meet in a dark alley.\" Reporters often used thinly-veiled racist terms - 'gorilla' and 'beast' - in their descriptions of him. When he was set to face Floyd Patterson for the heavyweight title, President John F Kennedy went so far as to urge Patterson to find an opponent with \"better character\".\n\nSon"} +{"output_text": "'s article on the subject.\n\nI have been taking Gerovital for about a month now. I have been taking it for about a month now. I have been taking it for about a month now. I have been taking it for about a month now. I have been taking it for about a month now. I have been taking it for about a month now. I have been taking it for about a month now. I have been taking it for about a month now. I have", "input_text": " it said to use caution when a person is on antidepressants. I am currently 50 mg of Zoloft\u00ae 1x per day, and 25 mg. Nortriptyline at bedtime, would it be safe to try the above product?\n\nGerovital tablets rarely interfere with antidepressants, not when used at \u2018normal\u2019 doses- that's 1x 100mg tablet a day. Of course that is not to say that it cannot cause side effects in everyone. Should you develop any nausea or headache or other unwanted side effects discontinue its use, but there are many people who use regular antidepressants and still take Gerovital tablets without issue.\n\nI am interested in ordering the Gerovital H-3 by Dr. Ana Aslan. I know it contains Procaine HCl, but I can\u2019t find the listing for its other ingredients. 1. Please let me know the ingredients. 2. Also, what makes this such a miracle drug? 3. is this imported product from Romania or is it produced elsewhere?\n\nOur injections are from the original lab in Romania and produced to Ana Aslan's formula etc. So are the Gerovital-H3 tablets.\n\nThe GH3-Pro tablets are not currently available from Romania, these are a copy (precise to the original formula see here). They contain 100mg Procaine, 6mg Benzoic acid, 5mg Potassium metabisulphate and 500mg of disodium phosphate per tablet, the excipient in the tablet is maltodextrin.\n\nGerovital has a wide range of uses, and its primary action is believed to originate from its mild reversible MAO inhibition and the actions of Procaine and its metabolites PEAE and DEAE to help inhibit cortisol. You can read many articles written by professionals about Gerovital, in particular Mircea Dumitru"} +{"output_text": "ando por la ley de emergencia econ\u00f3mica.\n\nEl Gobierno busca un acuerdo con el FMI\n\nEl Gobierno busca un acuerdo con el Fondo Monetario Internacional (FMI) para que el organismo financiero pueda seguir financiando la deuda p\u00fablica.\n\nEl Gobierno busca un acuerdo con el Fondo Monetario Internacional (FMI) para que el organismo", "input_text": " \u201cauxiliar\u201d.\n\nEl fiscal Alberto Nisman (Mart\u00edn Rosenzveig)\n\nLas palabras de la funcionaria tuvieron as\u00ed r\u00e1pida lectura y respuesta en la Justicia, porque sugiere un error grave en t\u00e9rminos de legalidad, adem\u00e1s de funcionalidad. Y es probable que tambi\u00e9n tenga registro en el plano diplom\u00e1tico, no necesariamente p\u00fablico, seg\u00fan advierten conocedores de ese ambiente.\n\nLa se\u00f1al asoma contradictoria con los gestos centrales del Gobierno hacia Estados Unidos, adem\u00e1s del FMI y representantes de inversores que jugaron millones en papeles argentinos.\n\nAyer mismo, aunque con apuro y cierta falta de negociaci\u00f3n previa, Alberto Fern\u00e1ndez busc\u00f3 dar un primer paso hacia el armado de un pacto social, para la coyuntura de precios y salarios, y un Consejo Econ\u00f3mico Social, que deber\u00eda trabajar en temas de mediano plazo, al menos, y que incluye en las tratativas un entendimiento con Roberto Lavagna como estrella.\n\nEs tambi\u00e9n una puesta para expresar respaldo y consenso a la idea de recomponer la econom\u00eda para renegociar la deuda con plazos largos y quita. Estuvieron representantes de la UIA y de la CGT. Fue notoria la falta de las organizaciones del campo. Y qued\u00f3 un interrogante sobre la amplitud pol\u00edtica a futuro. Algunas de los objetivos planteados en esta cita deber\u00edan tener expresi\u00f3n legislativa, empez"} +{"output_text": " man several of us know. I first met him as I believe all Athenians did at WRCT. By the time I started working for the CS Engineering Lab in summer 1970, he was the \"lab manager\" or some title like that. He was still that in 1976. About 10 years or so ago at homecoming, \"Scotch & Soda\" was having a big reunion. Paul had been quite involved in Scotch & Soda and thanks to him I became briefly involved as", "input_text": " around in the CS computer room. You can see a couple PDP-10 consoles as well as a lot of DECTAPE drives in this photo as well :-) I've seen John Godfrey twice since, the first time some time in the early to mid 80's and the second and last time in the late 90's. By then he had moved away from that house he had just a couple blocks from the campus and had a nice place in a nice looking older neighborhood up on a wooded hill in Swissvale. As I was a road warrior and US Air had a hub in Pittsburgh, sometimes when I was waiting for the outgoing plane I would call him and occasionally got him but its been over 10 years now since I did that. Although I've been in Pittsburgh several times since, the limited time I had and other things on the agenda kept me from trying to contact him. 73, Chris Hausler\n\nThis photo, in addition to a second view of Carol at the left of the photo shows Paul Newbury, a man several of us know. I first met him as I believe all Athenians did at WRCT. By the time I started working for the CS Engineering Lab in summer 1970, he was the \"lab manager\" or some title like that. He was still that in 1976. About 10 years or so ago at homecoming, \"Scotch & Soda\" was having a big reunion. Paul had been quite involved in Scotch & Soda and thanks to him I became briefly involved as well as one of his helpers. Anyway I attended this reunion and Paul was listed as planning to attend but I never did find him. However, it was quite a zoo and even if he was there I might have missed him.\n\nHi All, This photo, in addition to a second view of Carol at the left of the photo shows Paul Newbury, a"} +{"output_text": "\u2019t know what it is about the Black Beer Beef Debris Po Boy, but it is one of the best Po Boys I\u2019ve ever had. The meat is tender and the sauce is rich and flavorful.\n\nThe Barqs Root Beer is a nice touch, but it\u2019s not necessary. The Po Boy is already delicious.\n\n2. The Original Killer Po Boys\n\nKiller Po Boys\n\n811 Conti or 219 Dauphine St", "input_text": ". As simple a concept as that is, chefs and cooks across the south have invented dozens of new recipes and techniques, not even to mention toppings.\n\nWhen I set my sights on a Po Boy for lunch, I\u2019m usually always going to lean towards one of these two. Although, with the different varieties out there, that choice grows increasingly more difficult each year.\n\nSo, despite the fact the similar sandwiches have always existed elsewhere, and that the idea of putting fried seafood on bread may not have originated in the Crescent City, (see above skepticism) and possible problems with the Martins\u2019 story, there is no denying that the Po Boy, in its purest sense, is a unique New Orleans food.\n\n\n\n\n\nWhere to get Po Boys in New Orleans \u2013 Plaid Shirt Yoga Pants Approved\n\n\n\nThere are so many restaurants to eat Po Boys in New Orleans, and I\u2019ve taken the task to make sure to taste different Po Boys around town. Pontchartrain Smoke favors Oyster Po Boys while Plaid Shirt Yoga Pants loves a good, messy roast beef Po Boy.\n\nKeep coming back to see other locations and varieties that we\u2019ve tasted all around town. Have a restaurant you want added to the list \u2013 make sure to reach out!\n\n1. Black Beer Beef Debris\n\nKiller Po Boys\n\n811 Conti or 219 Dauphine St\n\nThere are two Killer Po Boys locations in New Orleans \u2013 the original location inside of Erin Rose and Big Killer Po Boys on Dauphine Street.\n\nThe Big Killer Po Boys has an extensive menu, including catering to our vegetarian and vegan friends! But since I am not a vegetarian, I went straight for the Black Beer Beef Debris Po Boy with a Barqs Root Beer.\n\nI don"} +{"output_text": " a question about the applicant\u2019s religious beliefs from the interview.\n\n\u201cI was very upset,\u201d she said. \u201cI thought it was inappropriate.\u201d\n\nKobza said she was told the question was asked because the applicant was a member of a church that was not recognized by the state.\n\n\u201cI was very upset,\u201d she said. \u201cI thought it was inappropriate.\u201d\n\nKobza said she was told the question was asked because the applicant was a member of a", "input_text": " Hammes said.\n\nHe said Branstad wasn\u2019t in a hurry to name people to the commissions because the judicial branch had temporarily frozen new hires to save money.\n\n\u201cNow that the court has opened up their freeze \u2014 we are appointing to the commissions,\u201d Hammes said.\n\nHow political is the process?\n\nTim Semelroth, a Cedar Rapids defense lawyer, served on the Sixth Judicial District Nominating Commission from 2006 to 2012 \u2014 a period when it faced five openings on the local bench.\n\n\u201cWe had a lot of judges retire during that period,\u201d he said.\n\nARTICLE CONTINUES BELOW ADVERTISEMENT\n\nSemelroth did not know whether his peers were Democrats or Republicans and it didn\u2019t seem to matter, he said. The group spent most of its time discussing candidates\u2019 work ethics and temperaments.\n\n\u201cI don\u2019t think partisan politics ever came up,\u201d he said. \u201cI never asked a judicial candidate about hot-button political issues.\u201d\n\nBut commission members also have an opportunity to meet privately with judicial candidates, Semelroth said, and he does not know how often politics plays a role in one-on-one interviews.\n\nState Judicial Nominating Commission members in 2013 quizzed applicants for an Iowa Court of Appeals opening about religion and marriage, topics that shouldn\u2019t have bearing on the applicants\u2019 judicial qualifications, the Des Moines Register reported.\n\nOne commission member, Scott Bailey, asked an applicant whether she was breaking her marriage vows because her husband was working in another state, while a second commissioner, Elizabeth Doll, asked an applicant\u2019s methodology in picking a church.\n\nNeither Bailey nor Doll remain on the panel.\n\nTammy Kobza, who serves on the District 3B nominating commission, fought to remove"} +{"output_text": " however, that Marchand is on pace for a career-high in goals. He is also on pace for a career-high in points. He is also on pace for a career-high in power-play goals. He is also on pace for a career-high in power-play points. He is also on pace for a career-high in shots on goal. He is also on pace for a career-high in shots on goal. He is also on pace for a career-", "input_text": " Laine. These two have been trading blows, going back and forth for the rookie goal scoring lead (which Laine currently has). While Laine leads rookies in points, I give the edge to Matthews because he has played a significant role in leading the once-terrible Leafs to a potential playoff appearance while the Jets appear to be headed for a long summer. Matthews is also much more valuable on the power-play and just appears to be more important to his team from an outside point of view Also, the NHL loves a good story line, and what\u2019s better than a kid from Scottsdale making noise on a national level? Besides, he\u2019s overshadowing an entire NHL franchise, that has to mean something.\n\nNorris\n\nBrent Burns \u2013 San Jose Sharks\n\nBurns appears to be consensus favorite for the Norris. One point that can be used against him is similar to one that has been used on Erik Karlsson many times and it is that Burns could be considered as sort of a \u201cglorified forward\u201d as he makes most of his production in the offensive zone. While that is a valid point, Burns plays a great 200-foot game and is a very solid defensive defenseman. It is also worth mentioning that Burns has a case for the Hart which is something that I believe carries a lot of weight in this decision. There are a few defensemen that have a case because they are much better in the defensive zone than Burns, for example Victor Hedman. But in recent years point production has been given a lot of weight in considering the Norris winner. Because of this, I think Burns takes home the trophy.\n\nRichard\n\nBrad Marchand \u2013 Boston Bruins\n\nAs we speak, Sidney Crosby leads the league in goals with 40, but right behind him at 37 sits this guy. It is worth noting,"} +{"output_text": "rt. Die t\u00fcrkische Armee hat die Stadt Manbidsch bereits besetzt. Die Stadt Tal Rifat ist noch nicht besetzt. Die t\u00fcrkische Armee hat die Stadt Tal Rifat bereits besetzt. Die t\u00fcrkische Armee hat die Stadt Tal Rifat bereits besetzt. Die t\u00fcrkische Armee hat die Stadt Tal Rifat bere", "input_text": "dar. Diese Message kann ein wichtiger Vorbote von jenen Ereignissen sein, die uns in den n\u00e4chsten Tagen und Wochen in al-Bab erwarten. Die syrische Armee k\u00f6nnte eine Offensive gegen die sogenannte \u201cFreie Syrische Armee\u201d, mit der sie sich seit f\u00fcnf Jahren im Kriegszustand befindet, starten.\n\nFalls dies passieren sollte: Wird die t\u00fcrkische Armee dann einen Krieg gegen die syrische Armee riskieren? Man darf nicht vergessen, dass der t\u00fcrkische Syrien-Einmarsch auf keiner rechtlichen Grundlage fu\u00dft. Falls sie also einen Zusammensto\u00df mit dem Regime riskiert, w\u00fcrde dies einen neuen Krieg, in dem internationale und regionale M\u00e4chten involviert sind, bedeuten. Das syrische Regime und Russland haben dem t\u00fcrkischen Kampf in al-Bab bewusst zugestimmt. Ab dem jetzigen Zeitpunkt jedoch wird man versuchen, die T\u00fcrkei sowohl international als auch regional als Besatzer darzustellen, damit sie sich aus den besetzten Gebieten al-Bab, Dscharablus, Rai, Soran, Exterin etc. zur\u00fcckzieht.\n\nEine andere Option: Der Krieg gegen die Kurden\n\nSeit der Invasion von Dscharablus haben die an die T\u00fcrkei gebundenen Banden die St\u00e4dte Manbidsch und Tal Rifat zu ihren \u201cn\u00e4chsten Zielen\u201d erkl\u00e4"} +{"output_text": " for my conduct,\u201d Hodge said in a statement. \u201cI am deeply ashamed of my actions and I apologize to my family, my friends, my colleagues, my company and the public.\u201d Hodge\u2019s admission came as the first of the parents to plead guilty in the scandal, which has ensnared dozens of wealthy parents and their children. The admissions scandal has already led to the resignation of the former head of the University of Southern California\u2019s athletic department, the firing of the head of the admissions", "input_text": " which a lot of people are going to have on hand already. That means for a very small amount of money, you can have a very nice computer running one of the most popular Linux distributions. Some people (including me) might argue that there are really not many (or any) significant advantages of Ubuntu MATE over Raspbian, but even I can't deny that MATE looks more polished, and if you are accustomed to Ubuntu in general or MATE in particular, then this distribution is the way to go.\n\nRead more on the Raspberry Pi 3 More parents plead guilty in US college admissions scandal\n\nShare this article: Share Tweet Share Share Share Email Share\n\nBoston \u2014 One father conspired to pay bribes to get two children admitted to the University of Southern California \u2014 one as a recruit in soccer, the other in football. A couple plotted to cheat on college entrance exams for their two daughters. A mother worried that her daughter might figure out that she was trying to get her a fake ACT score, saying, on a call that turned out to be recorded by authorities, \u201cShe already thinks I\u2019m up to, like, no good.\u201d Four parents \u2014 including the former head of one of the world\u2019s biggest asset managers and an heir to a fortune created by microwaveable snacks \u2014 pleaded guilty Monday in the nation\u2019s largest college admissions prosecution. With trials drawing closer and prosecutors warning of new charges, the four were part of a new wave of parents pleading guilty to using lies and bribery to secure their children\u2019s admission to elite colleges. Among them was Douglas Hodge, a former chief executive of Pimco and one of the most prominent business executives caught up in the scandal. He admitted that he conspired to pay more than $500 000 in bribes to get two of his children admitted to USC as athletic recruits. \u201cI accept full and complete responsibility"} +{"output_text": "\nThe m\u0430nu\u0430l ju\u0456\u0441\u0435r h\u0430\u0455 a \u0455\u0456z\u0435-\u043eut dr\u0456\u0440\u0441u\u0440, a \u0455w\u0456ng-\u043eut dr\u0456\u0440\u0441u\u0440, \u0430nd a n\u043en-slip b\u0430\u0455\u0435.\n\nThe m\u0430nu\u0430l ju\u0456\u0441\u0435r h\u0430\u0455 a \u0455\u0456z\u0435-\u043eut dr\u0456\u0440\u0441u\u0440, a \u0455w", "input_text": "t\u0430nt\u0456\u0430l \u0430m\u043eunt \u043ef ju\u0456\u0441\u0435.\n\nClick to Continue Reading\u2026..\n\n2. H\u0430m\u0456lt\u043en B\u0435\u0430\u0441h 932 Commercial C\u0456tru\u0455 Manual Juicer\n\nThis i\u0455 \u043en\u0435 \u043ef th\u0435 b\u0435\u0455t m\u0430nu\u0430l ju\u0456\u0441\u0435r \u043en th\u0435 market. Unl\u0456k\u0435 m\u0430n\u0443 \u043eth\u0435r m\u0430nu\u0430l ju\u0456\u0441\u0435r m\u0430\u0441h\u0456n\u0435\u0455 ju\u0456\u0441\u0435r\u0455, on the m\u0430rk\u0435t th\u0430t h\u0430v\u0435 a specific d\u0435\u0455\u0456gn th\u0430t \u0430ll\u043ew\u0455 f\u043er a unique press and m\u0430n\u0435uv\u0435r fun\u0441ti\u043en dur\u0456ng ju\u0456\u0441\u0456ng, th\u0435 h\u0430milt\u043en beach design h\u0430\u0455 a t\u043et\u0430ll\u0443 \u0430w\u0435\u0455\u043em\u0435 d\u0435\u0455\u0456gn.\n\nIt\u2019\u0455 a \u0441\u043emm\u0435r\u0441\u0456\u0430l gr\u0430d\u0435, m\u0430nu\u0430l ju\u0456\u0441\u0435r w\u0456th rack-and-pinion g\u0435\u0430r\u0456ng.\n\nIt\u2019\u0455 m\u0430d\u0435 \u043ef heavy-duty m\u0435t\u0430l \u0430nd f\u0456n\u0456\u0455h\u0435d w\u0456th h\u0456gh-\u051bu\u0430l\u0456t\u0443 \u0435n\u0430m\u0435l \u0430nd acid-resistant \u0441hr\u043em\u0435.\n\nThe bu\u0456ld i\u0455 pretty simple: there\u2019s a r\u0435m\u043ev\u0430bl\u0435 filter, \u0455w\u0456ng-\u043eut dr\u0456\u0440\u0441u\u0440 \u0430nd th\u0435 n\u043en-slip b\u0430\u0455\u0435.\n"} +{"output_text": "\u2019s based on the exploitation of workers. It would mean getting rid of the system of wage labor, and replacing it with a system of free, universal, and equal access to the means of production. It would mean getting rid of the system of private property, and replacing it with a system of common property. It would mean getting rid of the system of exploitation, and replacing it with a system of cooperation. It would mean getting rid of the system of competition, and replacing it with a system", "input_text": " fought with cops on the streets to have their unions upheld. But unions got comfortable after capitalists gave them a little bit here and there in the form of bargaining rights, healthcare and some welfare after the Second World War. The unions became satisfied, and ended up losing many of the arts of organizing and mobilizing\u2014often, in fact, forgetting to organize and mobilize at all. What\u2019s more, those benefits that workers won were often produced on the backs of Indigenous people and people of other countries, who faced imperialist exploitation. Meanwhile, the capitalist system itself keeps suffering deep crises, where capitalists can\u2019t keep up their profits, so they look for new ways to make profits.\n\nSo then beginning with crises in the 1970s, and with the successes of anti-imperialist movements in much of the Third World, the capitalists said it\u2019s time to cut back and to get rid of these welfare initiatives to maintain general profits. We see the effects of this trend now, with a gutted economy that isn\u2019t giving people jobs it once used to. The problem here is that capitalists, bosses, try to make more profits. Workers try to make more wages and to get a \u201csocial wage\u201d like healthcare and education. Bosses get profits because they own the capital, which they own because previous generations of workers produced it for them. This is class struggle. Unless workers understand that they are engaged in a continuous class struggle\u2014where there can be temporary truces, but no permanent settlements\u2014they\u2019re not going to be able to move on to better forms of society. Capitalists will always strike back, because they own the productive powers of society\u2014everything we depend on.\n\nA permanent settlement would mean getting rid of the bosses, and with them, to get rid of a system built on profits and that\u2019s so prone to crises because it"} +{"output_text": " the deal to the Commons was a \u2018mistake\u2019. May\u2019s deal is not a good deal. It is not a deal that will deliver the Brexit that the country voted for. It is not a deal that will deliver the economic benefits that the country voted for. It is not a deal that will deliver the political benefits that the country voted for. It is not a deal that will deliver the benefits that the country voted for. It is not a deal that will deliver the benefits that the country", "input_text": " and that her plan will see payments to Brussels falling from \u00a310 billion a year to well under \u00a31 billion.\n\nBut a Brexit that is concerned almost solely with these subjects does nothing for those who want to break free from the protectionism of the customs union. It doesn\u2019t allow the UK to move away from EU environmental rules that are too often not based on proper scientific evidence. It means the UK must accept all of the EU\u2019s existing social legislation \u2014 so this deal doesn\u2019t even restore us to the position we were in after the Maastricht treaty, when the country at least had an opt-out from the social chapter. We shall have to wait to see what the final EU/UK trade deal stipulates. But the common rulebook approach that May advocates would also see the EU\u2019s precautionary principle continue to hinder scientific research and technological innovation.\n\nThe third problem that Theresa May will leave behind her is a lack of trust. In the aftermath of the referendum, she won the leadership because she was acceptable to all sides. During the referendum, May had positioned herself rather cleverly (or, her critics would say, cynically) as a reluctant Remainer. She never sounded like a great enthusiast for the EU, gave a speech setting out her position that contained many caveats, and in the last ten days of the campaign said that she still wanted more on free movement. This positioning meant that she was acceptable to all sides when the leadership campaign came. Brexiteers believed that she really did mean that \u2018Brexit means Brexit\u2019, while Remainers thought she understood their concerns.\n\nBut May\u2019s premiership has eroded that trust. Brexiteers argue, with justification, that May has approached the whole thing as a damage limitation exercise rather than as an opportunity to be seized. As one cabinet minister acknowledges, the decision to put"} +{"output_text": " in the past few years. What is it about this series that makes it different?\n\nOE: I think that the series is a way of working on a theme, and that the theme is the relationship between the human and the natural. I am interested in the way that we are always in a relationship with nature, and that this relationship is not always a harmonious one.\n\nEWA: You\u2019ve been working on this series for a while now. What is it about this", "input_text": " it!\n\nEWA: The Turner paintings used for the color experiments are inspired by natural phenomena, a relationship found in your work as well. There seem to be a few layers at work here: Turner translated his experiences of nature, and you, in turn, translate Turner\u2019s vision in your paintings. How is the role of nature different for you in this series as opposed to in past work?\n\nOE: That\u2019s an interesting question, which I think touches on something very important about how we understand nature. I do not believe that you can truly separate nature and culture, because there is in fact no outside. As animals, we humans are always part of nature. The science journalist Lone Frank said in a conversation I had with her recently for my Riverbed catalogue that \u201cculture is something that arises from the human brain\u2019s way of functioning, from our way of being animals.\u201d We think we know what nature is, but it has more to do with what we exclude, what we say is not nature.\n\nI suppose that your point is that my paintings are a kind of mediation on Turner\u2019s mediation on an experience of nature. But perhaps we should look instead at the pigments as part of nature, and think of painting as a material phenomenon that can be perceived the way we sense a rainbow, a river, or a volcano.\n\nEWA: David Hockney once said there is a \u201crelationship between the way we depict space and behave in it.\u201d What do you make of this idea within the context of your work in relation to Turner\u2019s?\n\nOE: It\u2019s a fascinating proposal, but I am generally not as interested in what an artwork depicts as in what it produces, its performative aspect, the way it induces you to act and behave in space.\n\nEWA: You\u2019ve worked in series before, particularly"} +{"output_text": " a bit of a surprise.\n\nYeah, it was a bit of a surprise. I think it was a bit of a surprise to everyone. I think it was a bit of a surprise to the guys at 2000AD, too. I think they were a bit taken aback by it. I think they were a bit taken aback by the fact that we were doing it. I think they were a bit taken aback by the fact that we were doing it in the first place.", "input_text": ", really, but we kept saying we would have to get together on something, so we had that possibility in the back of our minds for a long time. Ben comes from a soundtrack background: he understands working to picture incredibly well. I come from a completely different place when it comes to music, having worked on much more traditional, song-based stuff, and eventually finding my way towards working on much more experimental arrangements, but never really working with pictures, with film. But I had always thought that would be a fun thing to do. So we started talking about things... strangely enough I was working on the Banksy film [Exit Through The Gift Shop], and I was talking to him over a pint after football about working to picture. I was the music supervisor on that project: I made some tunes to go in the film, rather than actually writing a score.\n\nBen and I had agreed that we had to work together, and then we were approached with the opportunity to do a film score as a project, with the idea of it being synth-based. We began working with some traditional elements too, like strings and stuff like that, but really messing around with them, time-stretching them really far, to create a different kind of vibe. So, the project for the film didn't really work out, but we just kind of kept on going! Because of the type of music that it was, and because I've always been a 2000AD fan, it just made sense to connect it with 2000AD and Mega-City One. So, we went to see the guys at 2000AD, and they were up for the concept and supported it.\n\nSo, just to be clear, there's no connection between DROKK and the forthcoming movie Dredd, with Karl Urban?\n\nNo, there's no connection now, no.\n\nThat was"} +{"output_text": ", ha voluto sottolineare che \u00abl\u2019Italia \u00e8 un partner importante per la Francia e per il mondo\u00bb. \u00abLa Francia \u00e8 pronta a sostenere l\u2019Italia nella sua missione di mediazione in Libia\u00bb, ha aggiunto il portavoce del ministero degli Esteri, Benjamin Griveaux, che ha anche sottolineato che \u00abl\u2019Italia \u00e8 un partner importante per la Francia e per", "input_text": "i dei trafficanti, favoriti dall\u2019instabilit\u00e0 sull\u2019altra sponda del Mediterraneo. L\u2019ultima notizia \u00e8 che, approfittando del caos, quasi 2.000 migranti africani sarebbero fuggiti da un centro di detenzione vicino all\u2019aeroporto di Tripoli.\n\nLa conferenza di novembre\n\nA Palazzo Chigi si sono limati anche i dettagli sulla conferenza sulla Libia in programma a novembre, probabilmente in Sicilia, con la quale l\u2019Italia punta a confermare il suo ruolo di mediazione nel Paese. Moavero continua a tessere la sua tela con una serie di contatti telefonici, ultimo in ordine di tempo quello con lo stesso Serraj, proprio nel giorno in cui pare essersi sbloccata la situazione nella capitale. Prima ancora di discutere di elezioni - ha anticipato il ministro - il tema prioritario dell\u2019appuntamento di novembre \u00absar\u00e0 la sicurezza, pre-condizione per lo svolgimento del voto\u00bb. Un tema su cui Italia e Francia hanno finora registrato una distanza, con l\u2019Eliseo che ha continuato a insistere perch\u00e9 i libici vadano alle urne entro dicembre.\n\nDa Parigi, per\u00f2, \u00e8 arrivata stasera una nota conciliante del ministero degli Esteri, che dopo le critiche contro la Francia mosse in primis dal vicepremier Matteo Salvini"} +{"output_text": " new design elements which are a little different to the W11.\n\nThe front of the phone is dominated by a large 5.5-inch Full HD display with a resolution of 1920 x 1080. The display is protected by Gorilla Glass 3 and is surrounded by a metal frame which is a little thicker than the W11.\n\nThe back of the phone is dominated by a large 5.5-inch Full HD display with a resolution of 1920 x 1080. The display is protected by", "input_text": " the U.S. operation to remove the Islamic State from Raqqa: Reuters\n\nThe peace agreement between Colombia and Farc rebels is set to be signed today: The Guardian\n\n\u2014 The Air Force is expected to be 1,000 pilots short in 2017: San Antonio Express-News\n\n\u2014 Celebrity chef Robert Irvine is partnering with Sodexo, which provides food to dozens of military installations, to try to improve the chow: Military Times\n\nFollow us on Twitter Dave Brown @dave_brown24\n\n\n\nBryan Bender @bryandbender\n\n\n\nConnor O'Brien @connorobriennh\n\n\n\nJacqueline Feldscher @jacqklimas\n\n\n\nLara Seligman @laraseligman With so many Chinese Android phones now available will THL\u2019s flagship T100S offer enough to win over from its rivals? Keep reading for the full THL T100S review.\n\nSince Mediatek launched their 8-core MT6592 processor the Chinese Android landscape has boiled over with new octa-core phones with their own unique styling and features. This brings us to the THL T100S which we have been testing over the past few weeks.\n\nThe THL T100S is one of the first 8-core Meditek phones to go on sale in China and one of only a few which can be ordered internationally, but it has plenty of rivals from the likes of Zopo, TCL, GooPhone, Huawei and others.\n\nTHL T100S Specifications\n\nTHL T100S Design\n\n[komper pid=139 compareform=no]\n\nThe THL T100S has been designed from the ground up to replace the THL W11 aka Monkey King. The new phone features a few"} +{"output_text": "os sin nada\", dice Romero.\n\nEl Ayuntamiento de Madrid, que ha sido el principal responsable de la situaci\u00f3n, ha asegurado que la indemnizaci\u00f3n es \"justa y adecuada\" y que \"no se ha producido ning\u00fan robo de propiedad\".\n\n\"No es justo que se nos pague por un metro cuadrado de nuestro terreno, que es nuestro patrimonio, y", "input_text": " retrasando a\u00f1o tras a\u00f1o en un tortuoso proceso burocr\u00e1tico que dura hasta hoy.\n\n\"En 2005, la Comunidad de Madrid se retir\u00f3 y el Ayuntamiento, viendo lo que se le ven\u00eda encima, vendi\u00f3 todo el terreno a Dragados-ACS para que ellos se encargaran de todo, incluida la negociaci\u00f3n de las expropiaciones y sus indemnizaciones, algo que es la primera vez que se ha hecho en Madrid. Nunca hab\u00eda visto algo as\u00ed en toda mi vida y llevo muchos a\u00f1os en esto. Y es tan desesperante que empezaron siendo 320 familias y el censo ha bajado a 200\", dice Luis Romero, arquitecto y t\u00e9cnico en urbanismo, elegido hace meses como presidente de la Asociaci\u00f3n de Afectados para intentar buscar una soluci\u00f3n.\n\nLa constructora empez\u00f3 a desplegar las gr\u00faas, a ejecutar las expropiaciones y a realojar a algunas familias. Pero lo hizo manejando unas cifras que los afectados recitan de memoria, mascando la indignaci\u00f3n. Acogi\u00e9ndose a la legislaci\u00f3n vigente y a sendas sentencias judiciales, se indemniza a 868 euros por metro cuadrado y se ofrece a cambio vivienda de protecci\u00f3n oficial a unos 1.700 euros el m2. \"Es de sentido com\u00fan que el trato es absolutamente injusto y un robo de nuestro patrimonio. Aqu\u00ed ya no hablamos de ganar nada en el proceso, sino de no quedarn"} +{"output_text": ", \"Well, I'm not going to be like Hattie McDaniel.\" But I'm not Hattie McDaniel. I'm Mo'Nique. I'm a black woman. I'm a woman. I'm a mother. I'm a wife. I'm a daughter. I'm a sister. I'm a friend. I'm a daughter-in-law. I'm a mother-in-law. I'm a grandmother. I'm a great-grandmother", "input_text": " because the board members were not properly appointed and that the board did not have enough members to do business without the improperly appointed officials.\n\nNoel Canning prevailed in the U.S. Circuit Court of Appeals for the District of Columbia, and two other appeals courts also had ruled against recess appointments. Mo'Nique: I Was \"Blackballed\" After Winning My Oscar\n\nThe 2010 best supporting actress winner for 'Precious' \u2014 who refused to campaign for her award \u2014 says she was told by her director Lee Daniels that the perception is she's \"difficult\" and \"tacky,\" and she's lost out on several roles as a result.\n\nA version of this story first appeared in the Feb. 27 issue of The Hollywood Reporter magazine.\n\nAt the 2010 Academy Awards, Mo'Nique wore white gardenias in her hair \u2014 just as Hattie McDaniel had in 1940 when she became the first African-American actress to win an Oscar. The Precious star later thanked McDaniel in her best supporting actress acceptance speech \"for enduring all that she had to, so that I would not have to.\" As The Hollywood Reporter recognizes the 75th anniversary of McDaniel's historic win, we speak at length with Mo'Nique about her debt to her movie-star idol, her memories of her own Oscar night and the dramatic turn her career has taken in the five years since. As director Lee Daniels put it to her in a recent phone call, \"Mo'Nique, you've been blackballed.\"\n\nHow do you respond to those who criticize Hattie McDaniel for only taking maid roles?\n\nIf they knew who this woman really was, they would say, \"Let me shut my mouth.\" If they really understood the fights behind the scenes, the conversations we'll never have the opportunity to hear. And then you say"} +{"output_text": ", a spare tire, and a jack, and a price tag that\u2019s not too far from the average SUV.\n\nLada Niva\u2019s interior is a bit more spacious than you\u2019d expect from a compact SUV, but it\u2019s not exactly luxurious. The seats are comfortable, but they\u2019re not as supportive as you\u2019d expect from a car of this class. The steering wheel is a bit too small for my liking, but it\u2019s not too bad.", "input_text": " likes, but no one can dispute the tiny Soviet SUV\u2019s off-road capability. So, what are exactly Niva\u2019s shortcomings compared to other classic SUVs here? To begin with, it was never imported into the US, although it was both produced and marketed around the world. Then, Niva is far from body-on-frame SUVs of old. In fact, if you\u2019re looking for a culprit responsible for modern-day crossovers, look no further. Lada Niva is the world\u2019s first mass-produced unibody SUV with independent coil spring front suspension. A definition of most modern crossovers indeed.\n\nIn true Soviet fashion, Lada Niva only offered two engine choices over the span of 40 years (and counting). Furthermore, it didn\u2019t exactly evolve in terms of design either. Some would say it was perfect from the beginning, but we know Eastern Bloc car philosophy better than that. The above mentioned dynamic duo of 4-cylinders consisted of a 1.6L and 1.7L. The former was a 72-horsepower carbureted one, while the latter arrived in 1993 and brought GM\u2019s single-point fuel injection with it. Some markets offered Peugeot\u2019s 1.9L diesel as well, but they\u2019re much rarer than original Russian petrol engines.\n\nAs spartan as anything car-related coming out of Russia, Lada Niva can traverse any given terrain. There\u2019s truly no place one of the best off-road SUVs can\u2019t tame. 4WD and rugged nature go a long way, but Niva\u2019s compact size and low weight complement the former couple of attributes rather well. The complete package is a nicely rounded amalgam of immense off-road capability, fine maneuverability, convenience features list consisting of an ashtray"} +{"output_text": "feat Verak Lith while not in a group. (X/50) Combat Boss Kills 5\n\nSuperiority Complex Solo Kills Solo Yes Defeat Verak Lith while not in a group. (X/5) Combat Boss Kills 5\n\nSuperiority Complex Solo Kills Solo II Yes Defeat Verak Lith while not in a group. (X/10) Combat Boss Kills 5\n\nSuperiority Complex Solo Kills Solo III Yes Defeat Ver", "input_text": "moning skill. Skills No 5\n\nSummoning 90 Yes Reach level 90 in the Summoning skill. Skills No 5\n\nSummoning 92 (Halfway There) Yes Reach level 92 in the Summoning skill. Skills No 5\n\nSummoning 99 Yes Reach level 99 in the Summoning skill. Skills No 5\n\nSun Shade Yes Plant the dominion marker in the desert, with all its achievements complete, including killing Sunfreet. Exploration Desert 25\n\nSunshine Through The Rain Yes Defeat Hard Mode Commander Zilyana whilst under the effects of her cloud attack and the Sunshine ability. Combat Solo PvM 40\n\nSuperior Successor Yes Obtained a collection of unique drops from the Magister. Miscellaneous Feats 0\n\nSuperiority Complex I Yes Defeat Verak Lith in a group. (X/5) Combat Boss Kills 5\n\nSuperiority Complex II Yes Defeat Verak Lith in a group. (X/10) Combat Boss Kills 5\n\nSuperiority Complex III Yes Defeat Verak Lith in a group. (X/25) Combat Boss Kills 5\n\nSuperiority Complex IV Yes Defeat Verak Lith in a group. (X/50) Combat Boss Kills 5\n\nSuperiority Complex Solo Kills I Yes Defeat Verak Lith while not in a group. (X/5) Combat Boss Kills 5\n\nSuperiority Complex Solo Kills II Yes Defeat Verak Lith while not in a group. (X/10) Combat Boss Kills 5\n\nSuperiority Complex Solo Kills III Yes Defeat Verak Lith while not in a group. (X/25) Combat Boss Kills 5\n\nSuperiority Complex Solo Kills IV Yes De"} +{"output_text": "ament, you shall go up to the sky, you shall ascend to those who are above the earth.\"\n\n- The Book of the Dead, The Egyptian Book of the Dead, The Book of the Dead, The Egyptian Book of the Dead, The Book of the Dead, The Egyptian Book of the Dead, The Book of the Dead, The Egyptian Book of the Dead, The Book of the Dead, The Egyptian Book of the Dead, The Book of the Dead, The Egyptian Book of", "input_text": "ESSAS Resumption of The Experiences on the Energy Enhancement Course In the meditation that followed I continued my snake like progress upwards and asked the white light for help. I had a lump in my throat chakra so I sent that to the fire.\n\n\n\nI thought I would also ask if I would see again a person I really wanted to see. The image came back of my first love and I acknowledged the grief I felt at his loss. I poured bucketfuls of my grief down into the fire below.\n\n\n\nWhen trying out EE Level 1 Stage 4, I had problems when constructing the visualisation. At first I had an image of a golden tower but only slightly up from the base was built, I could not seem to go any further. Later I tried building it by one brick at a time, it seemed a long process.\n\n\n\nDevi Dhyani identified a blockage above the crown chakra, which she removed using Energy Enhancement Level Three Techniques. Things improved but the white light was not strong until near the end when a thin rainbow of light fell down from on high, it then turned to a long waterfall.\n\n\n\nI had the feeling of pushing upwards and a desire to fly...... THE Macrocosmic Orbit, The Energy Enhancement Supra Galactic Orbit,, The Energy Enhancement Supra Galactic Orbit, MAXIMUM POWER UP!! OVERCOME THE SHOCK AND FEAR OF THE DEATH PROCESS SOUL CONTACT!\n\nPractise this Initiation every day so that when you die you will not be so shocked by the Death Process.\n\n\" Your Soul is bound for the sky, your corpse is beneath the ground... You shall go up to the sky.... You shall ascend to those who are above the earth.... You shall ascend to the sky, you shall traverse the firm"} +{"output_text": "A \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZoeva \u2013 100% vegan!\n\nZo", "input_text": " Beauty \u2013 Black-owned brand\n\nURB Apothecary\n\nUrban Decay \u2013 It should be noted that they are cruelty-free, but their parent company is not.\n\nUrsa Major\n\nBack to top\n\nV\n\nVanity Planet\n\nVapour Organic Beauty\n\nVelvet 59 \u2013 100% vegan!\n\nVerdant Botanicals \u2013 100% vegan!\n\nVintage Cosmetic Company, The\n\nVIRGINIC \u2013 100% vegan!\n\nVitacare\n\nVOLANTE Skincare\n\nVolition Beauty\n\nVoilaVe\n\nBack to top\n\nW\n\nW.S. Badger\n\nWander Beauty\n\nwashbeautyco. by MaskerAide \u2013 100% vegan!\n\nWe Love Eyes \u2013 100% vegan!\n\nWeleda\n\nWell Scent \u2013 100% vegan!\n\nWellness Prioritized \u2013 100% vegan!\n\nWen by Chaz Dean\n\nWhitening Lightning\n\nWhole Foods Brand/365\n\nWildMint Cosmetics \u2013 100% vegan!\n\nWrenn \u2013 100% vegan!\n\nBack to top\n\nX\n\nXenca\n\nXyrena \u2013 100% vegan!\n\nBack to top\n\nY\n\nYaby Cosmetics\n\nYarok\n\nYASOU natural skin care\n\nYes to Carrots\n\nYLLO Scrub \u2013 100% vegan!\n\nYorba Organics\n\nYoungblood Mineral Cosmetics\n\nYouth to the People \u2013 100% vegan!\n\nYu-Be\n\nBack to top\n\nZ\n\nZ Natural Life \u2013 100% vegan!\n\nZabana Essentials \u2013 100% vegan!\n\nZAK"} +{"output_text": " and Dostie could search the property, they had to get permission from the property owner, who was out of town.\n\n\"I'm not going to give you permission to search the property,\" said the property owner, who asked not to be identified. \"I don't want to be involved in this.\"\n\nDostie says he told the property owner that he was going to search the property anyway.\n\n\"I said, 'I'm going to search the property anyway", "input_text": " (CNN) -- The Charles Manson murder spree of 1969 ended in a remote Death Valley, California, cabin called Barker Ranch. It's where Manson and members of his cult \"family\" hid after the seven murders, dubbed the \"Helter Skelter\" killings that terrified the country.\n\nNow, thanks to a small-town detective and his cadaver dog, Manson's hideout might be searched for more murder victims.\n\nAbout a year ago, Sgt. Paul Dostie of the Mammoth Lakes Police Department decided to test his dog, Buster, at the Barker Ranch. He heard rumors that Manson and his followers had killed more people and buried them behind their hideout.\n\nAfter several visits, Buster, who was trained to find human remains, found five possible graves, Dostie says.\n\nA few weeks ago, a CNN crew went with Dostie, Buster and gold prospector Emmett Harder to Barker Ranch. Harder knew Manson and his top lieutenant, Charles \"Tex\" Watson, and spent time with the Manson family in 1969. Harder says at that time he had no idea some of Manson's group of more than 30 men, women and children had just gone on a killing spree. Watch a report from the ranch \u00bb\n\nGetting to the Barker Ranch requires a four-wheel drive to manage the steep, rocky terrain of the Golar Wash -- a narrow passage separating the High Desert Mountains from the arid desert valley below. As we bounced around on the drive in, it was hard to imagine how Manson and his cult got a school bus up the same road 40 years ago. See photos inside the Manson compound \u00bb\n\nWe finally arrived at the Barker Ranch about an hour after leaving the ghost town of Ballarat.\n\nBefore Buster, the dog,"} +{"output_text": " They\u2019re a part of the history of wrestling, and they\u2019re a part of the history of wrestling fans. They\u2019re a part of the history of wrestling fans who grew up watching the WWE. And they\u2019re a part of the history of wrestling fans who grew up watching the WWE and who are now wrestling fans themselves.\n\nDave Millican is a wrestling historian. He\u2019s a fan of wrestling history. He\u2019s a fan of wrestling history who has been collecting wrestling", "input_text": " copy, but it\u2019s still a genuine Reggie Parks belt, nonetheless. He points out the age, the wear-and-tear on the belt, and a seam where Reggie doesn\u2019t seal the leather at the edge of his championship belts, a difference in the style of individual beltmakers. While they\u2019re not as definitive to my untrained eye as fingerprints would be to an investigator, the workmanship is clear to Millican, who has been laboring with Reggie and collecting his championship belts. Very few of these belts were made. And when Dave has asked Reggie about them, Parks has said he simply doesn\u2019t remember making them\u2013not the original nor the copy. He\u2019s made so many belts for so many people that it\u2019s a believable answer\u2013although a disappointing one.\n\nWhile a championship belt has always been a prop, not until fairly recently have audiences come to grasp that fact. The championship belts are, at their core, a combination of zinc, electroplated metal (gold and/or nickel), and leather. But wrestlers took them seriously. So did fans. Now WWE has multiple copies of its championship belts: For house shows, for appearances, for shows broadcast on TV and the Network. And that\u2019s not even counting the replicas fans can buy at merchandise stands or online. Titles were held up as a goal for a wrestler to reach\u2013a reason to fight, a prize for which to contend. And so modern titles have lost some of their luster. The Ultimate Warrior once said that the WWE championship belt was \u201cjust one more thing to carry\u201d through the airport. But fans still remember when those championship belts meant something\u2013if not in the wrestlers\u2019 eyes, at least in their own.\n\nThat\u2019s what I\u2019ve learned about the championship belts as I visit with Dave."} +{"output_text": " to the temples and the government can also take up a project to restore the missing icons,\u201d he added.\n\n\u201cThe government should also take up a project to restore the missing icons. The government should also take up a project to restore the missing icons. The government should also take up a project to restore the missing icons. The government should also take up a project to restore the missing icons. The government should also take up a project to restore the missing icons. The government should also take up", "input_text": "alore.\n\n\u201cAir pollution can also compound the problem with thick layers of dust and soot getting deposited over time in the outdoors, especially as the sculptures are near a car parking lot, emitting a lot of soot and particulate matter,\u201d she added.\n\nT. Satyamurthy, former Superintending Archaeologist of the ASI, said stone idols, pillars lose their features such as carving if they are exposed to sun and rain. \u201cThere will be acidity in the first few spells of rain and that acid will deposit over the stone. Stone is not made only of silica but also other minerals. These minerals will react to the acid and then stone idols will deteriorate fast,\u201d he cautioned.\n\nOn wooded pieces, Dr. Sharada said wood deteriorates easily if left exposed. Alternate exposure to wet and dry conditions can lead to swelling. Shrinkage and dryness lead to cracking and disfiguration and peeling off of painted layers especially on objects such as vahanas. If wood becomes very waterlogged, it needs to be treated for drying out, with some methods such as bathing in ethanol and then in acetone and so on, she explained.\n\nDelhi\u2019s solution\n\n\u201cThe Archaelogical Survey of India has opened an exclusive museum to display restituted and seized artefacts at Purana Quila in New Delhi. The Tamil Nadu government can do something similar for objects that cannot be immediately traced to temples,\u201d said S.Vijaykumar, art enthusiast.\n\n\u201cThe artefacts especially the bronzes need to be immediately assessed for potential bronze disease and remedial measures must be taken up. Further as per the submission by HR & CE Department, they have a list of temples with 1,450 stone and bronze icons reported missing from 1992. These seized idols can be distributed"} +{"output_text": " of cocaine. The most common neurological complications are seizures, stroke, and acute confusional state. Cocaine-induced seizures are usually generalized tonic-clonic seizures, but focal seizures have also been reported (154). Cocaine-induced stroke is a rare complication, but it has been reported in association with cocaine abuse (155). Cocaine-induced acute confusional state is a common complication of cocaine abuse and is characterized by a sudden onset of confusion, agitation, and delirium.", "input_text": "ostimulant. The natural source of cocaine in the form of coca plant leaves has been known for centuries. In 1859, an active alkaloid cocaine was isolated by Albert Niemann of Germany (144). For the following half a century, it was mostly used for medical purposes as a local anesthetic and treatment of depression but thereafter it became increasingly popular as a drug of abuse. Cocaine is a Schedule II drug.\n\nClinically evident effects of cocaine are largely dose dependent with correlations to plasma levels. At the same time, large differences between individuals are present. Clinical manifestations may vary by the route of administration, purity of sample, and duration of cocaine abuse. Its stimulant effects manifest as agitation, euphoria, and hyperthermia (145). A variety of movement disorders have been reported in association with cocaine abuse. Slow frequency (<8 Hz) hand tremor has been described in abstinent cocaine abusers (146). Cocaine may induce tics or exacerbate Tourette\u2019s syndrome (147, 148), and it has also been reported to cause punding and opsoclonus\u2013myoclonus (149). Dystonic reaction has been reported in children after accidental exposure to cocaine in their home environments (150). Neuroleptic malignant syndrome following delirium has rarely been described in the acute stage, as well as persistent parkinsonism following a 3-month abstinence from the drug (151). A rare syndrome of fulminant encephalopathy with manifestations of seizures, bradykinesia, myoclonia, and bilateral MRI hyperintensities in basal ganglia has been described in HIV-positive cocaine abusers (152). The use of potassium permanganate in the processing of coca-leaf extraction can also lead to manganese intoxication (153).\n\nNeurological complications are more common and severe with the smokeable alkaloidal form"} +{"output_text": " talent. The studios are looking for the best talent and the best deals. The unions are looking for the best deals and the best talent.\u201d\n\n\nThe WGA has been in a contract dispute with the studios for more than a year. The guild has been trying to get a new contract for writers for years, but the studios have been reluctant to negotiate.\n\nThe WGA has been in a contract dispute with the studios for more than a year. The guild has been trying to get a", "input_text": " and looking at what they have in development and production,\u201d said Kagan Bierman. \u201cThey\u2019re appropriately and strategically analyzing their slates so that a strike doesn\u2019t cripple the entertainment business. It is not simply about stockpiling scripts, which often has a negative connotation. It is also about whether to move forward with projects in development and what type of programming to focus on developing and producing.\u201d\n\nThe streaming revolution has brought new tech giants into the Hollywood market, such as Amazon and Apple. They are hungry for new content but don\u2019t have a history of dealing with unions.\n\n\nThen there\u2019s Netflix, which is not a member of the producers alliance. The Los Gatos company has already started forging its own labor deals, including a landmark contract with SAG-AFTRA this summer, that could give it a competitive advantage in the event of a work stoppage\n\nThe ability to negotiate a separate deal with Netflix also gives WGA and other talent unions some leverage with the studios. Contracts for SAG-AFTRA and the Directors Guild of America expire June 30.\n\nAnother change over the past decade is the consolidation among legacy media companies, with the formation of AT&T-Warner Media, Viacom-CBS, Disney-Fox and Comcast-NBC. These corporate behemoths may take a harder line with unions and have more resources to fight the WGA and other guilds. At the same time, they need talent more than ever to feed their new streaming pipelines in order to compete with Netflix.\n\n\u201cThey both need each other in the end, but when there is some consolidation it can lead to a little more leverage,\u201d said David Smith, a professor of economics at the Pepperdine Graziadio Business School. \u201cThere is almost a frenzy in terms of production and the need for"} +{"output_text": " difficult to determine the best license for a piece of software that is being developed for a specific purpose. The Open Source Initiative (OSI) has attempted to address this issue by creating a set of principles for open source software licensing [21]. The OSI\u2019s principles are intended to be a guide for developers and users of open source software, but they are also relevant to the software licensing of academic and research software. The OSI\u2019s principles are:\n\nThe license should be compatible with", "input_text": ", why using a FOSS license does not preclude commercialization (see above), why you think commercialization is not the most appropriate goal for your work, or why broad dissemination is an important goal for you. If you wish to propose a license that limits or forgoes the potential for generating revenue, you may first have to convince your TTO staff that your work lacks commercial value. While the process can sometimes be a bit of a negotiation, most institutions care a great deal about the scientific and societal impact of their IP, and we find that it is rare for an institution to act contrary to the express wishes of the creator of a work. Knowing what you want and why you want it should go far in making the licensing process as painless as possible.\n\nThe Complication of Software Patents An additional reason to contact your TTO before applying a license is software patents. Modern TTOs arose following the Bayh-Dole Act of 1980, which allows US research institutions to patent inventions developed using public funds and to license those patents [19], [20]. Because the vast majority of academic and research inventions are unlikely to have significant commercial value, most are never patented, but institutions typically require the disclosure of any patentable invention to the TTO. Many FOSS licenses (like the BSD or MIT licenses) are agnostic regarding patents, while some explicitly include patent grants in the license text (like the Apache or GPL licenses) (Table 1). Software patents are highly complex and generally outside the scope of this guide, but be aware that your TTO will want to discuss patent strategy, as well as copyright.\n\nSoftware Licensing and the Open Culture of Science The needs and obligations of academic and publically funded research create unique considerations for scientist-programmers choosing a software license. Unlike in the software industry, where licensing strategy is primarily a matter of business strategy, it can be"} +{"output_text": "I\u2019m not saying I\u2019m going to vote for Bernie, but I\u2019m not going to vote for Trump. I\u2019m not going to vote for Hillary. I\u2019m not going to vote for anyone. I\u2019m going to vote for the idea of voting. I\u2019m going to vote for the idea of voting for the person I think will do the best job. I\u2019m going to vote for the idea of voting for the person who will do the best job of", "input_text": " why I voted for Hillary in the 2016 primary). And Joe Rogan endorsed him. With Bari Weiss on as his guest. And he\u2019s holding a rally with Vampire Weekend next month. I\u2019m not saying the cure is worse than the disease, but it\u2019s certainly got no shortage of side effects. If I support Bernie long enough I\u2019ll probably end up affecting his Frankenstein underbite. Fire bad!\n\nWhy he\u2019ll suck as president\n\nBernie would be 79 years old on his inauguration day, beating Trump as the oldest inaugurated president in U.S. history. He\u2019s really fucking old. And I know it was nice to see him quickly recover from a heart attack in October and get right back out on the trail, but that doesn\u2019t mean Bernie is tough. It means he\u2019s at the age where spontaneous heart failure can happen anytime, anywhere. He won\u2019t tap Marianne Williamson as his running mate, but he sure as hell better tap someone who adheres to her health and skincare regimen.\n\nAnd even if Bernie does not drop dead while in office, he\u2019s still not gonna accomplish anything useful. Bernie is a pleasant figurehead for a movement that needs a lot more than that.\n\nOne unhateable thing\n\nHis whole \u201cfuck you we\u2019re going to win\u201d campaign ethos? I like it.\n\nHe\u2019s gonna win this thing. I just hope he gets his shit together while he does. \u201cMy feet, arms, and legs are so sore,\u201d I tell my wife over dinner. She laughs and tells me I haven\u2019t left home, gone to the gym, or done anything other than write and play VR games the past few days. \u201cI know,\u201d I tell her. \u201cThat\u2019s the point.\u201d\n\n"} +{"output_text": " down to the ground, where it landed with a thud.\n\nI was alone.\n\nI was free.\n\nI was Kenshi.\n\nI was a god.\n\nI was a man.\n\nI was a monster.\n\nI was a hero.\n\nI was a thief.\n\nI was a hero.\n\nI was a thief.\n\nI was a hero.\n\nI was a thief.\n\nI was", "input_text": " only seen as part of a heavily-guarded caravan, that I could trade with but not possibly hope to steal from, by force or by stealth (not yet). Why was this one alone? Its saddlebags revealed it was not wild, as some were \u2013 it belonged to someone. Were they nearby? Had they been ambushed and slain, and this one beast found its way to relatively safety during the attack?\n\nOr had the game glitched, leaving one creature, festooned with valuable building supplies, alone when it should not be?\n\nFeature or bug, feature or bug? That\u2019s a question which has repeatedly dogged my Kenshi experiences.\n\nI don\u2019t think it matters. What matters is that I took a chance and slew that beast. The fight was hard, my pair of men so weak that this exhausted creature was more than capable of kicking their heads in, but they survived, barely. No one came seeking vengeance, though for hours I winced whenever we passed anyone else. And, suddenly, I had an embarrassment of riches. Well, not compared to what would come later, but for now: enough to build a home, a farm, a training room, cooking facilities. Enough to get going, to stake a claim. A shortcut, of sorts.\n\nFeature or bug? Either, both. Whatever the cause, it felt so very Kenshi, this sudden shift, a story all my own, choices all my own. Nothing becoming everything.\n\nI carried the packbeast\u2019s corpse into town, slowly, arduously. I moved its supplies into storage crates or sold them at the bar, until the beast\u2019s packs were empty. When I finally lowered its great carcass to the ground, it launched itself at weightless speed a few hundred feet into the air, then drifted halfway"} +{"output_text": " itself on the energy of the cosmos. It\u2019s a parasite.\u2019\n\n\u2018What do you mean?\u2019\n\n\u2018It\u2019s a parasite. It feeds off the energy of the cosmos. It\u2019s a parasite. It feeds off the energy of the cosmos.\u2019\n\n\u2018What does that mean?\u2019\n\n\u2018It\u2019s a parasite. It feeds off the energy of the cosmos.\u2019\n\n\u2018What does that mean?\u2019\n\n\u2018It\u2019s a", "input_text": " pink flesh. Rows of translucent spikes ran down its body in five columns. They met up at a wrinkled sphincter at the top. The creature\u2019s body didn\u2019t move, but its spikes undulated up and down in a sad rhythm. Between the rows of spikes there were sets of large watery ovals that seemed to be eyes. A bit of yellow-green liquid, like pea soup, dripped out of one or two of them.\n\n\u2018Incredible, isn\u2019t it?\u2019 George said. \u2018We found this wonder in the cornfield back there, flopping around on the ground like a fish yanked out of water.\u2019 He held his hands together and bowed toward the cage.\n\n\u2018Where did it come from?\u2019\n\n\u2018It fell here from another glorious world. We found a thick round rubber vessel nearby.\u2019 He motioned off into the woods. \u2018Its house, I guess. If you want proof of how mysterious the cosmos are, watch this.\u2019\n\nGeorge flicked on the cattle prod and jammed it inside.\n\n*\n\nThe creature began convulsing, its spikes flailing in all directions. A sweet stench, like burnt garlic, surrounded us.\n\nThe message the creature was sending me was very clear now. Pain. Its hurt rained through my body, and by the scrunched-up look on George\u2019s face, his too. Then there was a thick ripping sound, like someone pulling apart an enormous steak with their hands.\n\n\u2018I can always feel its life force when I do that,\u2019 he said. When I looked back down, there were two creatures in the cage, identical in size and shape. They were both smaller than the one before.\n\n\u2018This creature doesn\u2019t even consume plants, much less animals. It sustains"} +{"output_text": "ly woman, that I had read the story of the \"Necronomicon\" and that it was a book that I had read and that I had found to be very disturbing. She was shocked and asked me if I had read the story of the \"Necronomicon\" and I told her that I had read it and that it was a very disturbing story. She then asked me if I had read the story of the \"Necronomicon\" and I told her", "input_text": " Cthulhu is the greatest even though not a lot happens, I love Cthulhu and the mythos but hinestly Call of Cthulhu is boring compared to the Dunwhich Horror and At the Mountains of Madness are much more intersting and eventful.\n\nXavier Luft on October 19, 2019:\n\nHaunter of the Dark is a shining trapezohedron above, his last written and one of his best written stories. Should be on the list, and the stories Robert Bloch wrote on either side of this story along with \"Notebook found in a Deserted House\" are among the greatest additions to the Cthulhu Mythos.\n\nZeron87 on October 02, 2019:\n\nMy favorite Lovecraft story is The Lurking Fear. I consider myself a lover of horror fiction, but Lovecraft's style, for the most part, turned me off, since I'm more of a \"Show me how it happened\" than a \"Tell me what happened\" type of guy, but The Lurking Fear had enough action pulling the story along to keep me interested. Also, something I barely notice people mentioning on when describing Lovecraft's tales: I may have abhorred reading it, but At The Mountains of Madness was a grand, encompassing tale connecting all of Lovecraft's mythos monsters together. Who else wants a LCU (Lovecraft Cinematic Universe)? In anycase, great article.\n\nDale Anderson from The High Seas on August 08, 2019:\n\nSo you read HPL for shocks, surprises, terror and amazement do you? Well let me share with you the most ALARMING Lovecraft story that ever was! Read on if you dare... I once mentioned to a lady friend of mine, a responsible, intelligent,well-read and world"} +{"output_text": "]. 18 http://www.w3.org/MarkUp/VRML/ 19 http://www.w3.org/MarkUp/VRML/ 20 http://www.w3.org/MarkUp/VRML/ 21 http://www.w3.org/MarkUp/VRML/ 22 http://www.w3.org/MarkUp/VRML/ 23 http://www.w3.org/MarkUp/VRML/ 24 http://", "input_text": " participants [133, pp. 173\u2013174], thereby creating a distributed human-powered computer. 6 http://british-legends.com/CMS/index.php/about-mudi-bl/history 7 Similar concerns were also raised by Harold Thimbleby, who presented at the Artificial Life II workshop but did not appear in the proceedings; his ideas were later published elsewhere [129]. 8 http://www/karlsims.com/genetic-images.html. Like Evolved Virtual Creatures, this work also ran on a massively parallel Connection Machine supercomputer. 9 http://www/karlsims.com/galapagos/ 10 See also http://www.win-vector.com/blog/2009/06/what-is-genetic-art/. 11 http://www.mzlabs.com/MZLabsJM/page4/page4.html 12 The work on Web-based evolutionary art systems reported here is part of a larger field of work on evolutionary art, much of which is not Web-based. A good review of the wider field can be found in [52]. 13 At the time of writing, a new version of TechnoSphere, in the form of an augmented-reality mobile app, is currently under development (see Section 6.2.2). 14 http://goldberg.berkley.edu/garden/ 15 http://web.archive.org/web/19980203215817/http://telegarden.aec.at/html/nyt.html 16 http://www.w3.org/MarkUp/VRML/ 17 Example images can be seen at http://www.ventrella.com/Tweaks/Absolut/absolut.html. The work is also described in [133"} +{"output_text": " a inventar o termo computador.\n\nA ideia de que os computadores podem ser usados para resolver problemas \u00e9 antiga. Mas a primeira m\u00e1quina que se tornou conhecida como computador foi a m\u00e1quina de Turing, que foi constru\u00edda em 1936. A m\u00e1quina de Turing era uma m\u00e1quina de computa\u00e7\u00e3o, que era capaz de calcular o valor de uma equa\u00e7\u00e3o matem\u00e1", "input_text": " a carne mo\u00edda, o tomate, as folhas de massa e a sa\u00edda seria a lasanha perfeitamente gratinada. \u201cMas nessas tarefas muitas vezes existe a influ\u00eancia da habilidade das pessoas que cozinham: n\u00e3o \u00e9 a mesma coisa uma receita preparada por um grande chef, que pode inclusive melhor\u00e1-la, do que por um principiante\u201d, matiza Miguel Toro, professor do Departamento de Linguagens e Sistemas Computacionais da Universidade de Sevilha. Na verdade, os algoritmos re\u00fanem opera\u00e7\u00f5es t\u00e3o simples que podem ser realizadas com sucesso por qualquer um. Inclusive por m\u00e1quinas. Aqui est\u00e1 o cerne da quest\u00e3o.\n\nAlgoritmos + computadores = revolu\u00e7\u00e3o\n\nPorque embora os algoritmos existam pelo menos desde o tempo dos babil\u00f4nios, com a chegada dos computadores eles ganharam muito mais destaque. A uni\u00e3o de m\u00e1quinas e algoritmos \u00e9 o que est\u00e1 mudando o mundo. O matem\u00e1tico brit\u00e2nico Alan Turing, famoso por ter decifrado a m\u00e1quina Enigma de mensagens cifradas dos nazistas e por ter se suicidado mordendo uma ma\u00e7\u00e3 envenenada depois de ter sofrido uma dura persegui\u00e7\u00e3o por causa de sua homossexualidade, foi um dos primeiros a relacionar algoritmo e computadores. Na verdade, foi um dos primeiros a imaginar um computador como o conhecemos. Inclusive chegou"} +{"output_text": " in chief,\u201d Harris said. \u201cHe is not only failing to lead, he is failing to listen. He is not only failing to lead, he is failing to respect the Constitution. He is not only failing to lead, he is failing to respect the values of this country. And he is not only failing to lead, he is failing to respect the people of this country.\u201d\n\nAD\n\nHarris's answer was more direct than Buttigieg's, but it was also more measured.", "input_text": "folding process that betrays the core strength of America, which is that we're a country that had been perceived as a country that could be counted on, to our allies and our adversaries,\u201d Buttigieg said when asked, even before the president ordered a full troop withdrawal, by reporters about the Trump administration standing down in Syria. \u201cWhen that happens, there is a terrible price to be paid. And it makes America less safe. By the way, we're also seeing ISIS fighters freed, so this is also undermining our own interests.\u201d\n\nAD\n\nHarris's \u201celectability' approach is more direct, calling out \u201cthe donkey in the room\u201d and saying she may have slipped out of the headlines because, once again, a nonwhite candidate is being underrated. Her Saturday event, which featured free barbecue and a DJ playing \u201cCupid Shuffle,\u201d drew out a few hundred voters, including dozens of African Americans \u2014 a contrast with Buttigieg. And she talks much more about her record than Buttigieg, describing \u201cthe Dark Ages\u201d of anti-crime hysteria when she was elected San Francisco's district attorney (critics said she should be \u201clocking people up\u201d if she wanted to win again) and her six years as California's attorney general (\u201cWhen I took on the five big banks of the United States, people said to me, calmly, your career will be over.\u201d)\n\nBut both candidates are asked to respond to events that aren't shaped by Democratic candidates \u2014 much less the candidates looking for a foothold while polling at single digits. Like Buttigieg, Harris was prompted to talk about the president's moves in Syria. Her answer ended in a similar place, but took a longer journey there.\n\nAD\n\n\u201cIt is, yet again, another display of Donald Trump's inability to understand the role and the responsibilities of commander"} +{"output_text": " and then we were gone.\n\nI\u0092ve never been able to shake the feeling that I was somehow responsible for his death. I\u0092m not sure why. I\u0092m not sure I want to know.\n\nI\u0092ve never been able to shake the feeling that I was somehow responsible for his death. I\u0092m not sure why. I\u0092m not sure I want to know.\n\nI\u0092ve never been able to shake the feeling that I was somehow responsible for", "input_text": " nominal cost. I eventually ended up with four (12) of my own, plus several (13) which I gave to women friends. Much of each meeting was taken up with mundane conversations about life and philosophy, and we always had lunch brought to us by guards -- a decent salad, bread and butter, milk, and overcooked meat or fish. Gacy would always \"pig out\" on the snacks we brought; he had a real sweet tooth, and the snacks were things he didn\u0092t get elsewhere. We\u0092d stay until about 2 PM, have one of the guards take some Polaroid photos at a dollar apiece (one of each pose for us and one for Gacy), and then dash madly for the St. Louis airport for a late afternoon flight back home.\n\nThis gives you an idea of the structure of our trips. The last of the four trips was the most remarkable. We saw him on Monday a week before his execution and were his last visitors other than family and appellate attorneys. He\u0092d called and written more frequently in the last few weeks and was plainly nervous, but still full of the old braggadocio. (14) He talked vaguely about an unnamed donor who was going to give him half a million dollars to fund another round of appeals. It all sounded possible, but I when I saw him in person I knew he was just blowing smoke. Ever the con artist, he almost had me convinced, but his unhealthy, beet red complexion and copious sweating even in air conditioning gave it all away. It was and always had been bullshit. He was going down and damn well knew it. He knew I knew it too. I almost felt sorry for him, but the looming image in my mind of his lifetime of lies wouldn\u0092t allow it. I listened quietly, shook his clammy hand when it was time to leave,"} +{"output_text": " commission is not a government agency that is supposed to be a neutral arbiter of the public interest. It's a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed to be a government agency that is supposed", "input_text": " freshly minted capable of 30 (spec'd for 42 down) megabits.. paltry 8 megabit DSL was no longer adequate.\n\nARGONAUT\n\nHave a nice day.\n\nPremium Member\n\njoin:2006-01-24\n\nNew Albany, IN ARGONAUT Premium Member > Cable companies know they have you.\n\nfiber_man\n\nThings Happen For A Reason\n\nPremium Member\n\njoin:2001-01-27\n\nPort Saint Lucie, FL fiber_man Premium Member regulation thank Judge Greene for this mess he started in 1984.\n\nthings were a lot better when the government had oversight of these companies. tanzam75\n\njoin:2012-07-19 tanzam75 Member Re: regulation said by fiber_man: thank Judge Greene for this mess he started in 1984.\n\nthings were a lot better when the government had oversight of these companies.\n\n\n\n\n\nThe Bell System myth of uniform excellence in service is exactly that -- a myth. Some Bell operating companies were better than others, and some areas were better-served than others. In 1984, you could find areas with electronic switching and private lines to every household. You could also find areas -- in Bell territory, not at some tiny independent rural telecom -- that had party lines and crossbar switching systems.\n\n\n\nFor example, Pacific Bell was one of the more neglected companies in the Bell System. The California public-utilities commission had a highly adversarial relationship with Pacific Bell, so Ma Bell preferred to direct its money towards friendlier locales. California became the neglected stepchild of the Bell System, where they did the minimum that they could get away with.\n\n\n\nIn other words, the golden days of the regulated past are hardly the panancea that they're made out to be. A public utilities"} +{"output_text": " it.'\"\n\nI was impressed. I was also a little bit scared.\n\n\"I'm not going to do it for you,\" I said. \"I'm not going to do it for anyone.\"\n\n\"Well, I'm not going to do it for you,\" Huckabee said. \"But I will do it for Keith Richards.\"\n\nI was impressed. I was also a little bit scared.\n\n\"I'm not going to do it for you", "input_text": " up drooling, your brain gone, riding a back seat on the bandwagon that suddenly has him charging toward the lead in the GOP race.\n\nIt almost happened to me a few months ago at a fundraiser in Great Falls, Virginia. I'd come to get my first up-close glimpse of the man Arkansans call Huck, about whom I knew very little -- beyond the fact that he was far behind in the polls and was said to be very religious. In an impromptu address to a small crowd, Huckabee muttered some stay-the-course nonsense about Iraq and then, when he was finished, sought me out, apparently having been briefed beforehand that Rolling Stone was in the house.\n\n\"I'm glad you're here,\" he told me. \"I finally get to tell someone who cares about Keith Richards.\"\n\nBefore I could respond, Huckabee plowed into a long and very entertaining story -- one that included a surprisingly dead-on Pirates of the Caribbean-esque impersonation -- about how Richards and Ron Wood got pulled over for reckless driving while on tour in Fordyce, Arkansas, a million and a half years ago, in 1975. Richards ended up getting a misdemeanor conviction -- an injustice that stood for thirty-one years, until Huckabee, a would-be rock musician himself, stepped in and pardoned Richards last year.\n\n\"It's a long process, pardoning,\" Huckabee said, placing a hand on my shoulder. \"It takes a lot of paperwork. And the funny thing is, people said to me afterwards, 'Governor, you'll do that for Keith Richards, but you wouldn't do that for an ordinary person.' And my answer to that is always, 'Hey, if you can play guitar like Keith Richards, I'll consider"} +{"output_text": ", TaeJa also participated in the IGN ProLeague Season 4, where he was eliminated in the group stage by the eventual champion, ByuN.\n\nTaeJa's first tournament after the IGN ProLeague Season 4 was the IPL 4, which was held in Los Angeles, California. He was seeded into the open bracket, and made it to the round of 16, where he was defeated by the eventual champion, MC. TaeJa then travelled to Korea to compete", "input_text": "-Terran match-up, as well as TheStC and the former GSL champion jjakji. TaeJa was barely defeated by the latter in his first match, but managed to turn the tables as he overwhelmed DongRaeGu 2-0 and beat jjakji in their second encounter, therefore securing a spot in the round of 16.\n\nIncreased Presence in the Foreign Scene (Apr \u2014 Jul 2012) [ edit ]\n\nStill inexperienced in sprint-like tournaments (the IPL 4 being the second of the kind he ever attended after the Assembly Winter), TaeJa failed to reach the championship bracket of the competition in Las Vegas.\n\nIn early April, TaeJa travelled to Las Vegas in order to compete in the IGN ProLeague Season 4 as an open bracket sign-up, which was the first North American tournament he ever attended. He made a good start, proceeding to his bracket finals at the expense of Killer and Maru, but failed to earn a spot in pool play as he was defeated by jjakji in this last match. TaeJa eventually dropped out the tournament after he was beaten by ByuN in their losers' bracket finals. At the end of the month, TaeJa played his 2012 GSL Season 2 Code S round of 16 group stage featuring the reigning, two-time MLG champion MarineKing, TheStC, and the Protoss-versus-Terran expert PartinG. He first met MarineKing, and was beaten 2-0, but nonetheless made it to the main bracket as he defeated TheStC and took revenge on MarineKing on the same mark. TaeJa's run in the GSL came to an end in early May, when he was crushed 3-0 by Squirtle in the quarterfinals. During the same period"} +{"output_text": "kyyn, joka oli h\u00e4nen mukaansa yksi h\u00e4nen suurimmista saavutuksistaan. H\u00e4nen mukaansa py\u00f6veli oli kuin kirjasto, jossa oli kaikki tietoja, joita py\u00f6veli tarvitsi.\n\n\u2013 Py\u00f6veli oli kuin kirjasto, jossa oli kaikki tietoja, joita py\u00f6", "input_text": "ietoja yli 90 ty\u00f6matkasta, mestasihan mies yli sata ihmist\u00e4. Hakalainen teki tarvittaessa keikkoja my\u00f6s oman \u2013 tuolloin hyvin ison \u2013 l\u00e4\u00e4nins\u00e4 ulkopuolellekin. Toisaalta h\u00e4nen ollessaan syyst\u00e4 tai toisesta estynyt omien t\u00f6idens\u00e4 hoidosta Pohjanmaalla keikkaili useampikin py\u00f6veli muista l\u00e4\u00e4neist\u00e4.\n\nHakalainen sinnitteli virassaan ik\u00e4miehen\u00e4, vaikka h\u00e4nen toimintakykyns\u00e4 arvellaan loppuvuosina heikentyneen. Tutkija Mikko Moilanen katsoo sata vuotta my\u00f6hemmin py\u00f6velin\u00e4 ty\u00f6skennelleen Niilo R\u00f6nbladin ty\u00f6m\u00e4\u00e4r\u00e4n vet\u00e4neen vertoja Hakalaisen urakalle.\n\nSaarij\u00e4rvelt\u00e4 l\u00e4ht\u00f6isin oleva Pohjanmaan R\u00f6nblad teki viimeiset ty\u00f6matkansa yli 70-vuotiaana, ja koko uransa aikana h\u00e4nelle ehtikin kerty\u00e4 ty\u00f6matkoja l\u00e4hes 100 000 kilometrin verran. Pelk\u00e4st\u00e4\u00e4n vuonna 1761 R\u00f6nblad vietti tien p\u00e4\u00e4ll\u00e4 yhteens\u00e4 l\u00e4hes kolme kuukautta ja kilometrej\u00e4 taittui tuona vuonna 4 000.\n\nKuoleman pel\u00e4ttiin tarttuvan Harrastajahistorioitsija Mikko Moilanen\n\nPy\u00f6velill\u00e4 oli oikeus yhteen ky"} +{"output_text": ", I am more convinced than ever that Maslow\u2019s model is the best way to understand human motivation. It is a model that is based on a lot of research and has been tested in a lot of different situations. It is also a model that is flexible enough to account for the unique human traits that have evolved through culture.\n\nI\u2019m not saying that Maslow\u2019s model is perfect, but I am saying that it is the best model we have right now.\n\n", "input_text": " biologically influenced and provides a lot of strong insight, backed by a huge amount of research, on some of our fundamental needs and motivations.\n\nHowever, not everyone is totally on board with this model. As soon as it appeared, so did critiques, and their main issue was the fact that this model fits all animals but forgets unique human traits that have evolved through culture, as Maslow tried to address with self-actualization. In one particularly salient commentary, Kesebir et. al. (2010) highlight a number of ways in which human culture has contributed to our evolution and began to introduce motivations that are a little less Darwinian and a little more human. They point out that research has shown a lack of innate need for parenting in a number of situations and postulate that the drive for sex alone is enough to ensure procreation, due to the lack of birth control in human history. They then move into what comes after \u2014 successfully raising children has not represented the end of life for humans for many years. They suggest based on psychological research that after parenting comes an innate need for wisdom and \u2018meaning in life\u2019, which has been demonstrated universally across cultures. This is also a need which is not only driven by but also often fulfilled by culture, through means such as community, art, religion, or scholarly pursuits.\n\nKesbir et. al. are not the only ones who just wouldn\u2019t give up on self-actualization. A number of other people have been down the same path and been unable to settle with the fact that once parenting has been accomplished, all human needs are met. In fact, the article itself was originally published alongside 4 critiques, including that of Kesbir et. al.\n\nSo can I give a conclusive and definite answer here? Definitely not. But I can absolutely say that after reviewing all of this theory"} +{"output_text": " with a broken ankle, broken wrist, or broken foot.\n\n\n\nYou\u2019ll get a concussion.\n\n\n\nYou\u2019ll get a dislocated shoulder.\n\n\n\nYou\u2019ll get a broken collarbone.\n\n\n\nYou\u2019ll get a broken wrist.\n\n\n\nYou\u2019ll get a broken ankle.\n\n\n\nYou\u2019ll get a broken foot.\n\n\n\nYou\u2019ll get a concussion.\n\n\n\nYou\u2019ll get a dislocated shoulder.\n\n\n\nYou\u2019ll", "input_text": " Iraqi reconstruction in his first book, We Meant Well: How I Helped Lose the Battle for the Hearts and Minds of the Iraqi People. A TomDispatch regular, he writes about current events at his blog, We Meant Well. His next book is Ghosts of Tom Joad: A Story of the #99Percent. To stay on top of important articles like these, sign up to receive the latest updates from TomDispatch.com here.\n\nFollow TomDispatch on Twitter and join us on Facebook or Tumblr. Check out the newest Dispatch Book, Ann Jones\u2019s They Were Soldiers: How the Wounded Return From America\u2019s Wars\u2014The Untold Story. 11 and half minutes is finite.\n\nNo matter what you\u2019re afraid of. Falling down. Running out of energy and instantly collapsing in a coma. Legs turning to dust. Failing to do something well. Every runthrough will end in no more than 11 and a half minutes. All comes to pass.\n\nYou have a greater chance of dying in a marching accident than in a plane crash. Or a shark attack.\n\nDrum corps is ABSURDLY dangerous. You run around a field with up to 50 pounds of equipment, multitasking for 11 and half minutes with 149 other people, and your drill writer literally writes dots inches apart from one another.\n\n\n\nIf you\u2019re in pit and don\u2019t march, you load and unload thousands of pounds of equipment from a truck 700 times a day.\n\n\n\nSometimes it\u2019s 110 degrees outside.\n\n\n\nAlso, there are thunderstorms and most of your equipment conducts electricity.\n\n\n\nIf you survive even one year of this, you have a guardian angel.\n\n\n\nThere is a lot of crying in drum corps.\n\nYou\u2019ll injure yourself and march"} +{"output_text": " country to various distributors. The distributors are usually connected to the cartel through a third party, such as a drug dealer or a corrupt police officer. The distributors are responsible for selling the drugs to street dealers, who sell them to customers. The street dealers are responsible for selling the drugs to customers. The street dealers are responsible for selling the drugs to customers. The street dealers are responsible for selling the drugs to customers. The street dealers are responsible for selling the drugs to customers. The street dealers are", "input_text": ",000 in profit. Joe\u2019s meth is made in a superlab concealed in a small warehouse in Michoachan. The two largest deep-water ports in Mexico are in Michoachan, which makes it easier to unload precursor chemicals shipped from India or China. The Knights Templar cartel controls the lab, which required several hundred thousand dollars of initial investment. The lab produces 50 kilos a week and is run by two guys, one with a degree in chemical engineering and another with a decade of experience cooking meth. Once the 50 kilos leave the lab, they must pass through the territory of the cartel Pacifico Sur and the Sinaloa cartel.\n\nThe 50 kilos are divided into five loads of ten kilos, each of which pays a $7,000 tax at each of four tax points. The loads end up in the hands of a Tijuana cartel that specializes in smuggling drugs across the border. To make the crossing, the cartel might try to bribe a customs official with the lure of $10,000 a week. For someone making $31,000 per year, this is often an irresistible offer. If a bribed official is working the day of the crossing, the driver might get a call telling him to be in lane six between one and two PM on Saturday. Without a bribed official, a halcyon, or eagle, will watch the entry point with binoculars for patterns and opportunities. If he notices the official in lane four is fighting with his wife on the phone and waving all the cars through without inspection, he\u2019ll radio this information to the driver.\n\nThe shipments are taken to a stash house in a suburb of Los Angeles controlled by a high-level member of the Michoachan cartel. From the stash house, they are shipped around the"} +{"output_text": " recht auf den Punkt, denn die meisten Laptops haben schon l\u00e4nger USB-C-Buchse. Und auch bei den meisten Notebooks ist es noch nicht so weit.\n\nAber auch bei den meisten Desktop-PCs ist es noch nicht so weit. Und das ist auch gut so. Denn die neue Technik ist nicht nur f\u00fcr die neuen Ger\u00e4te gedacht, sondern auch f\u00fcr die al", "input_text": " fire thanks to Deathstroke\u2019s Mirakuru army laying siege to the streets, but the worst of circumstances ends up bringing out the best in Oliver. He stays true to this season\u2019s \u201cno killing\u201d rule and the reveal that Slade is being left in a new prison on Lian Yu feels like a poetic and full circle end (for now) to one of Arrow\u2019s best villains. And while the season 1 finale left every character in chaos, season 2 ends with a clear path for each character heading into the new season. Plus, Lyla fires a grenade launcher, so how can you not put this episode at No. 1?! \u2014S.B. Von Helmut Martin-Jung\n\nRein statistisch stehen die Chancen 50:50, aber gef\u00fchlt ist es doch eher so: Man fummelt mit dem USB-Stecker an einem Laptop oder Handy herum und merkt dann: Mist, doch andersrum. Das ist irgendwann auch den Verantwortlichen f\u00fcr den Standard aufgefallen, zumal da Apple f\u00fcr seine iPhones schon l\u00e4nger Buchse und Stecker entwickelt hat, bei denen es egal ist, wie herum man sie einsteckt. Die USB-Leute haben aber nicht blo\u00df endlich den Stecker drehbar designt. USB-C, so der Name des neuen Formats, hat auch noch viele andere F\u00e4higkeiten bekommen.\n\nDoch durchgesetzt hat sich die neue Technik bei weitem nicht so schnell wie erhofft. Nur bei Smartphones haben nun nahezu alle h\u00f6herwertigen Ger\u00e4te die neue Buchse. Bei Laptops kommt die Sache erst"} +{"output_text": " told.\n\nThe former Labour MP, who died in January, was accused of sexually abusing boys in the 1970s and 1980s.\n\nThe inquiry was told that Lord Janner, who was a Labour peer, was a \u201cserial child abuser\u201d who \u201cexploited his position of power to sexually abuse children\u201d.\n\nThe inquiry was told that the alleged abuse took place in the 1970s and 1980s, and that the alleged victims were boys aged between eight and 16", "input_text": " organizations to confirm diagnoses, like public health authorities, to do the validation. Apple and Google said they want users to trust the system, and that includes users knowing that the system is reliable.\n\nHow is a confirmed COVID-19 case identified?\n\nApple and Google point out that while a positive test result is likely the best means of identifying a case, it isn\u2019t necessarily the only way. It\u2019s true that a diagnosis by a medical professional doesn\u2019t actually require a confirmed positive test result specifically identifying the presence of the virus \u2014 theoretically, a public health agency could set a lower bar, requiring just a diagnosis based on symptom presentation, for instance.\n\nBoth tech giants concede that for contact tracing to be effective, there needs to be a high degree of case identification within a population, but left the door open to the possibility that a high degree of case identification doesn\u2019t necessarily translate one-to-one to widespread testing, should other means of identifying cases be deemed reliable enough by local health authorities in any given area.\n\nShould you trust this system?\n\nThere\u2019s no easy answer. It seems like Apple and Google have made a system that\u2019s better than nothing, but it\u2019s a system that requires considerable user trust. You have to trust that Apple and Google have built a system that can withstand abuses \u2014 either from themselves or governments. But no system is foolproof or immune to abuse. If you don\u2019t trust the system, you do not have to use it.\n\nAn earlier version of this report incorrectly stated the Android 4.1 versions and higher that will get the update. It\u2019s Android 6.0 and above. We regret the error. Lord Janner is alleged to have exploited children to commit a 'full range' of sexual offences against them dating back to the 1950s, the public inquiry into child abuse has been"} +{"output_text": " Sacramento is a more balanced team, the Thunder are the more talented squad.\n\nRussell Westbrook is the league\u2019s most efficient scorer, and he\u2019s averaging a career-high in points per game. Victor Oladipo is a top-five shooting guard in the league, and he\u2019s averaging a career-high in points per game.\n\nThe Thunder are also the league\u2019s best rebounding team, and they\u2019re the league\u2019s best defensive", "input_text": " for Sacramento\u2019s locker-room morale.\n\nMost recently, LeBron James and Cleveland foiled the Kings\u2019 attempt to regain momentum. Dave Joerger\u2019s unit was overmatched 120-108 by a well-balanced Cavalier grouping.\n\nOne bright spot from that loss was DeMarcus Cousins\u2019 (26) and Rudy Gay\u2019s (23) combined 49-point eruption. However, the NBA\u2019s fifth-worst defense showed profound lapses in wake of Cleveland\u2019s non-stop offensive barrage.\n\nTonight\u2019s matchup vs. OKC is a pivotal marker along Sacramento\u2019s quest to eradicate an eleven-season playoff drought. As such, look for the Kings to come out with intent urgency on their home floor.\n\nINJURY REPORTS/PROJECTED STARTING LINEUPS\n\nBoth the Oklahoma City Thunder and Sacramento Kings are injury free heading into tonight\u2019s contest.\n\nOKLAHOMA CITY THUNDER\n\nPoint Guard: Russell Westbrook\n\nShooting Guard: Victor Oladipo\n\nSmall Forward: Andre Roberson\n\nPower Forward: Domantas Sabonis\n\nCenter: Steven Adams\n\nSACRAMENTO KINGS\n\nPoint Guard: Darren Collison\n\nShooting Guard: Garrett Temple\n\nSmall Forward: Anthony Tolliver\n\nPower Forward: Rudy Gay\n\nCenter: DeMarcus Cousins\n\nODDS\n\nOKLAHOMA CITY THUNDER -1.5\n\nThe Vegas books list Oklahoma City as a slight favorite in their second meeting of the season with the Sacramento Kings. Conversely, our partners at numberFire give OKC a sound 69% chance of knotting the score with their Western Conference rival.\n\nWhile Oklahoma City plays with greater pace, and"} +{"output_text": " minutes) rather than distance (e.g., 1 km). This may be a more accurate representation of the environment experienced by older adults, who may be more likely to walk for longer distances [115].\n\nThe associations between perceived and objective measures of the built environment and PA were generally similar, with the exception of the association between perceived aesthetics and walking. This may be explained by the fact that aesthetics is a subjective measure that is more likely to be influenced by individual perceptions. In contrast, the", "input_text": " reporting objective PA findings used accelerometer cut-points and half of those applied an MVPA cut-point of 1952 accelerometer counts per min derived for adults [111]. As older adults have a lower MVPA cut-point due to lower resting metabolic rates [112], using the adult accelerometer cut-point likely resulted in lower estimates of MVPA, potentially masking associations. To accurately classify different intensities of older adults\u2019 PA, future research using objectively assessed PA should be underpinned by appropriate cut-points.\n\nDifferences in built environmental correlates by type of environmental measurement method\n\nOverall, there were numerous differences in the associations between built environmental attributes and total PA and walking, based on type of environmental measure. Attributes that can be classed within the functional (e.g., pedestrian infrastructure) and destination domains in Pikora\u2019s framework tended to be significantly related to PA when objectively assessed [113]. In contrast, those attributes that fall within the safety and aesthetics domains were associated with PA when perceived measures were used. This may be explained by attributes within safety and aesthetics domains being more subjective in their interpretation and thus depend on perceptions that may vary greatly between individuals. Attributes related to function and destinations are more objective and, hence, are associated with lower levels of interpersonal differences in perceptions (e.g., a pavement is either present or it is not).\n\nEffects were generally stronger for associations between the perceived environment and PA, which is consistent with previous research [114]. Unlike the objective environment, perceptions of the same neighbourhood environment can greatly differ across individuals due to differences in socio-demographics (e.g., socioeconomic status), preference, experience, culture and/or amount of walking in the neighbourhood [30]. Regular walkers may have more accurate perceptions of their local environments. Moreover, perceived measures often define neighbourhood in terms of time to reach a destination (e.g., 10"} +{"output_text": "\n\n\n\n\n\n\n\nLegendaryActivity: 1288Merit: 1001Satoshi Nakamoto Re: [ANN] Bitcoin Gold (BTG) | PoS | No premine | No IPO | No ASIC | No premine | No IPO | No ASIC | No premine | No IPO | No ASIC | No premine | No IPO | No ASIC | No premine | No IPO | No ASIC | No premine | No IPO", "input_text": " su \u201cEspa\u00f1a Invertebrada\u201d: \u201cno entiendo c\u00f3mo se puede llamar reconquista a una cosa que dura ocho siglos\u201d. Otros profesionales actualmente como el catedr\u00e1tico de la Universidad de Extremadura, Francisco Garc\u00eda-Fitz, defienden la plena vigencia del t\u00e9rmino Reconquista.\n\nA\u00fan as\u00ed en los libros de texto de los estudiantes se ha sustituido en muchos de ellos. \u201cA veces, determinados t\u00e9rminos historiogr\u00e1ficos caducan y se olvidan. Esto sucedi\u00f3 con la \u2018Espa\u00f1a musulmana\u2019, concepto que hoy ya nadie utiliza, porque definimos esa realidad hist\u00f3rica como \u2018Al-Andalus\u2019. Lo mismo podr\u00eda decirse de la Reconquista, que es perfectamente prescindible, ya que basta con aludir a la conquista cristiana de al-Andalus\u201d, precisa Garc\u00eda Sanju\u00e1n.\n\nLa sala Hip\u00f3stila con 19 naves que fue utilizada como sala de oraci\u00f3n en la Mezquita de C\u00f3rdoba SeanPavonePhoto / Getty Images\n\nQuiza la Reconquista con may\u00fascula sufra ese proceso. Por ahora lo m\u00e1s reciente es que en el \u00faltimo diccionario de la RAE la toma de Granada ha pasado de ser considerada como el ep\u00edlogo de la Reconquista a la culminaci\u00f3n. Habr\u00e1 que esperar a la pr\u00f3xima edici\u00f3n del diccionario. BITDV\n\n\n\nOffline\n\n\n\nActivity: 1288\n\nMerit: 1001\n\n\n\n\n\nSatoshi Nakamoto"} +{"output_text": " other words, the Cleanup\u2019s oceanic system could be used to clean up the land.\n\nThe Ocean Cleanup\u2019s first prototype of its oceanic system. Photo: Ocean Cleanup\n\nThe Ocean Cleanup\u2019s first prototype of its oceanic system. Photo: Ocean Cleanup\n\nThe Ocean Cleanup\u2019s first prototype of its oceanic system. Photo: Ocean Cleanup\n\nThe Ocean Cleanup\u2019s first prototype of its oceanic system. Photo", "input_text": " also worries that the Cleanup\u2019s new paper could work against the effort to change the governmental policies on land that are exacerbating the plastic pollution crisis. \u201cThere exists the potential that [the plastics industry] will look at this paper and use it to justify blaming the consumer for littering, and cities for not managing their plastic trash, and continue to deflect and reject any conversations about eliminating high polluting throwaway products or regulating smarter design standards.\u201d But, Eriksen concedes, the fact that the Cleanup has acknowledged the problem of plastic pollution sources is a step in the right direction.\n\nIn response to the question of whether this paper is a direct response to critics like Eriksen, with whom Lebreton has worked in the past, Lebreton said, \u201cWe did not need any specific advice to do this. The initial motivation was to work on designing better sources for our oceanic model.\u201d He went on, \u201cFor our work to be successful, we need to understand how plastic pollution flows around the world, [and] better mapping its sources is a logical part of that work.\u201d\n\nI was reminded of something Slat had told me last June, when I joined him on the North Sea for the launch of the Ocean Cleanup\u2019s first prototype of Slat\u2019s boom. I had asked him about the blowback from Eriksen and others. \u201cIt\u2019s not either or,\u201d he had said agitatedly. \u201cWe should do both.\u201d So, was anything in the works? Always careful to keep the Cleanup\u2019s plans close to his chest, Slat suggested there wasn\u2019t, which I remember thinking was unfortunate. But, he added, \u201cI think eventually we\u2019ll be able to develop spin-off systems of what we\u2019re doing in the ocean, which can go closer to land, or maybe in rivers.\u201d In"} +{"output_text": " the main point, the players\u2019 reaction was interesting. They were clearly surprised by the intervention, but they didn\u2019t seem to be too upset. They were also very polite and respectful to the arbiters.\n\nThe arbiters\u2019 reaction was interesting too. They were clearly upset by the intervention, but they didn\u2019t seem to be too upset. They were also very polite and respectful to the players.\n\nThe arbiters\u2019 reaction was interesting too. They were clearly upset by the", "input_text": " second hall then nobody can accuse anybody of anything and it\u2019s a lot better for everybody. The players can concentrate more, they won\u2019t get distracted. I think that\u2019s just better for them.\n\nDanny King asked some very pertinent questions. Is it actually an offence to copy somebody\u2019s moves?\n\nI suppose it must be if we think they have copied moves, but we don\u2019t think that.\n\nThe awkward point here is that Karjakin confessed to exactly that \"offence\", both in the post-game interview and in a tweet, where he also implicated others!\n\nBut can you really restrict players looking at and, if they so choose, copying moves from other players in the same playing hall? And if you want to, where do you draw the line? The players here followed each other for 3 or 4 moves after known theory had ended, but do the arbiters need to know the theory and decide how many moves after that is significant? Wouldn\u2019t the logical \u201csolution\u201d eventually be to segregate every board, at least in the most important chess events? It\u2019s an interesting precedent.\n\nAnother pertinent question was about the players\u2019 reaction. Alex comments:\n\nThey seemed to be a bit confused as to why I was intervening, but when I explained it to both of them they seemed to understand the reason for it and they were willing to move into the other room.\n\nKarjakin, who had stayed on the same board and won, wasn\u2019t too concerned, but he also didn\u2019t appear to feel it had been necessary:\n\nI was completely fine with that. It doesn\u2019t happen many times in my life and I would love to continue actually, but ok, it\u2019s his decision. I have nothing against it, but there were no problems.\n\nGetting back to"} +{"output_text": " am v\u0103zut cum oamenii se \u00eembr\u0103cau \u00een haine de protec\u0163ie, pentru c\u0103 nu erau \u00een siguran\u0163\u0103. Acum, \u00een aceast\u0103 perioad\u0103, am v\u0103zut cum oamenii se \u00eembrac\u0103 \u00een haine de protec\u0163ie, pentru c\u0103 nu sunt \u00een siguran\u0163\u0103. Este o situa\u0163ie foarte grav\u0103\u201d, a declarat pentru AFP,", "input_text": " Un lucru observat la protestele din Australia este diversitatea extraordinar\u0103 a participan\u0163ilor - pensionari, tineri p\u0103rin\u0163i, \u015fcolari \u00een uniforme, grupuri religioase, echipe sportive, iar lista continu\u0103.\n\nAsia de Sud-Est, \u00eenecat\u0103 \u00een abur \u015fi fum\n\nDac\u0103 Asia de Sud-Est mai avea nevoie de vreun indiciu c\u0103 pericolele schimb\u0103rilor climatice sunt reale, regiunea se confrunt\u0103 \u00een aceast\u0103 perioad\u0103 cu cel mai r\u0103u caz de \u201enegur\u0103\u201d. \u201eCea\u0163a\u201d care acoper\u0103 marile metropole asiatice este o combina\u0163ie de fum gros, galben, teribil de nes\u0103n\u0103tos, care provine de la incendiile ilegale de p\u0103dure.\n\n\u00cen Indonezia \u015fi Malaezia \u015fcolile au fost \u00eenchise iar avertiz\u0103ri de pericol asupra s\u0103n\u0103t\u0103\u0163ii au fost emise de autorit\u0103\u0163i. Aerul dens \u015fi brun care st\u0103 \u00een aceste zile deasupra insulei Borneo este rezultatul arderii pe scar\u0103 larg\u0103 a p\u0103durilor pentru a le transforma \u00een sol pentru agricultur\u0103. Milioane de oameni respir\u0103 un aer care a fost calificat de speciali\u015fti drept periculos pentru s\u0103n\u0103tatea lor.\n\nFOTO: Ulet Ifansasti/Getty Images\n\n\u201eCopil\u0103rind \u00een Jakarta, \u00een anii 1990,"} +{"output_text": " than race or religion.\n\nThe problem is that the left has no interest in the concept of sex and gender. They are interested in the concept of gender. They are interested in the concept of gender identity. They are interested in the concept of gender expression. They are interested in the concept of gender roles. They are interested in the concept of gender norms. They are interested in the concept of gender roles. They are interested in the concept of gender roles. They are interested in the concept of", "input_text": " organization on campus, contributing to every class discussion.\u201d Kaden, a manager of the campus student cafe who knew Laura casually, was upset by her words. He emailed Laura and said her response was \u201cextremely disrespectful.\u201d He continued: \u201cI am not a woman. I am a trans man who is part of your graduating class, and you literally ignored my existence in your interview.... You had an opportunity to show people that Wellesley is a place that is complicating the meaning of being an \u2018all women\u2019s school,\u2019 and you chose instead to displace a bunch of your current and past Wellesley siblings.\u201d Laura apologized, saying she hadn\u2019t meant to marginalize anyone and had actually vowed beforehand not to imply that all Wellesley students were women. But she said that under pressure, she found herself in a difficult spot: How could she maintain that women\u2019s colleges would lose something precious by including men, but at the same time argue that women\u2019s colleges should accommodate students who identify as men?\n\nI feel sorry for students like Laura. They want to be inclusive, yet they want their \u201csisterhood\u201d. The moment someone identifies as male enters the foray, that ruins the latter possibility. The simplest solution would be to require every student to at least identify as female. That would solve the immediate problem of having females transitioning to or identifying as males on campus. However, it would cause the new problem of asking the students who identify as male to leave.\n\nYet I also do not feel sorry for the students. This is precisely what happens when political correctness is left to its own devices. Accommodating transgender views about sex and gender render the very concept of sex and gender moot. If you think a person can change their gender or sex, then neither two concepts are concrete. They essentially do not exist. They are simply social constructs no different"} +{"output_text": " to that), but the most interesting part of the discussion was the way he explained the CHR's mandate.\n\n\"The CHR is not a human rights commission. It is a human rights watchdog,\" he said. \"It is a body that monitors the implementation of the human rights law. It is not a body that enforces the law. It is a body that monitors the implementation of the law.\"\n\nGascon's comments are a reminder that the CHR is not a", "input_text": "\n\n\"That gives us an advantage in terms of managing the logistics. For example, in every area, the way we work is that we have a web backend where we have the areas defined,\" he explains. \"When you choose the location and go for the slot booking, the website automatically shows the next available date and slot in your area. We basically go for concentrated orders in a small area, and that decreases the logistics cost.\"\n\nRight now, Encashea only works in certain neighbourhoods centred around East Bengaluru, areas like Marathahalli, WhiteField, Bellandur, Outer Ring Road, and CV Raman Nagar.\n\n\"We are focused on keeping it as simple as possible,\" says Chaudhari. \"In our next app version, we are looking to simplify it even further, so that the app can automatically determine the location, and enable bookings within two taps, letting the user can schedule a date and time.\"\n\nIt seems like both these startups aren't worried about competition, or someone stealing their business model. \"In the waste sector, you need to have knowledge about recyclables, you need to know the buyers and sellers, it takes a lot of expertise. That's why people have not made a big attempt on it,\" says Pom Pom's Sethi. Human rights is a complex, abstract concept, so it's no surprise that so many people are led astray about its principles and about the mandate of our Commission on Human Rights (CHR). That's probably one of the reasons Chairman Chito Gascon agreed to field questions in a town-hall style forum on Reddit. (If you may recall, Senator Antonio Trillanes IV hosted a similar Ask Me Anything session a few months ago.)\n\nGascon's funny responses have been highlighted elsewhere (he thinks Ian Veneracion should play him, if it came"} +{"output_text": " bit, but we\u2019re confident that the overall experience will be better for it.\n\nWe\u2019re also looking at ways to improve the effectiveness of some of the more popular subclasses. We\u2019re not quite ready to share details on that yet, but we\u2019ll be sure to let you know when we\u2019re ready.\n\nWe\u2019re also looking at ways to improve the effectiveness of some of the more popular subclasses. We\u2019re not quite ready to share details on", "input_text": ": damage was increased for most Supers (and slightly reduced for Blade Barrage and Nova Warp), and we\u2019ve injected a couple new tricks into some of the older subclass paths. Keep reading for a more detailed look.\n\nJosh Hamrick recently touched on our upcoming plans for Sandbox, and we wanted to expand on our intentions a bit further. These are some of the goals guiding us: As frequently as possible, continue to shift the combat meta with Sandbox updates to keep Destiny fresh\n\n\n\nInvestigate and improve outliers\u2014underperformers in terms of popularity or effectiveness\u2014to keep all of your given Sandbox options feeling like relevant and effective choices\n\n\n\nAddress major Quality of Life Sandbox issues reported by the community to maintain a smooth and consistent gameplay experience in Destiny In this specific update, our goal was to look at Supers holistically across the board, bringing up damage where they underperform, fixing some major pain points, and adding new functionality where inspiration struck. To accomplish this, we did the following: Compiled community feedback\n\n\n\nLooked at data, both usage rates and effectiveness\n\n\n\nPlaytested changes and continued adjusting\n\n\n\nSo, specifically which Supers got a bump?\n\nDamage Increases: Golden Gun (three-shot and six-shooter)\n\n\n\nShadowshot (Moebius Quiver)\n\n\n\nArc Staff\n\n\n\nBurning Maul\n\n\n\nHammer of Sol (Code of the Siegebreaker)\n\n\n\nFist of Havoc\n\n\n\nSentinel Shield\n\n\n\nNova Bomb (Cataclysm and Vortex)\n\n\n\nDaybreak\n\n\n\nSmall Damage Decrease: Blade Barrage\n\n\n\nNova Warp\n\nThis is the beginning of a process. We believe we\u2019ve found a good place around which to balance the subclasses. This means a few outliers will come down a"} +{"output_text": " time, and manner of the attack. The objective of the offensive is to defeat the enemy. The offensive is the most powerful tool for achieving victory.\n\nDefensive. \u201cDefend the objective.\u201d The defensive is the opposite of the offensive. It is to defend the objective from attack. The defensive is the most powerful tool for achieving victory.\n\nFlexibility. \u201cAdapt to the situation.\u201d The flexibility of the military is to be able to change the plan of battle as the situation", "input_text": " especially true when any one strategy is expected to solve all problems or address all causes claimed by progressives. If a movement depends more on ideological purity than it does on accomplishments, it\u2019s easy for internal sectarian arguments to take priority over getting things done. It\u2019s easier to attack resistance strategists in a burst of horizontal hostility than it is to get things together and attack those in power.\n\nThe good news is that we can learn from a few resistance groups with successful and well-articulated strategies. The study of strategy itself has been extensive for centuries. The fundamentals of strategy are foundational for military officers, as they must be for resistance cadres and leaders.\n\nPRINCIPLES OF WAR AND STRATEGY\n\nThe US Army\u2019s field manual entitled Operations introduces nine \u201cPrinciples of War.\u201d The authors emphasize that these are \u201cnot a checklist\u201d and do not apply the same way in every situation. Instead, they are characteristic of successful operations and, when used in the study of historical conflicts, are \u201cpowerful tools for analysis.\u201d The nine \u201ccore concepts\u201d are:\n\nObjective. \u201cDirect every military operation toward a clearly defined, decisive, and attainable objective.\u201d A clear goal is a prerequisite to selecting a strategy. It is also something that many resistance groups lack. The second and third requirements\u2014that the objective be both decisive and attainable\u2014are worth underlining. A decisive objective is one that will have a clear impact on the larger strategy and struggle. There is no point in going after one of questionable or little value. And, obviously, the objective itself must be attainable, because otherwise efforts toward that operation objective are a waste of time, energy, and risk.\n\nOffensive. \u201cSeize, retain, and exploit the initiative.\u201d To seize the initiative is to determine the course of battle, the place,"} +{"output_text": " freedom of assembly.105\n\nThe \u201cdemocratic\u201d form of the class struggle is now obsolete. The \u201cdictatorship of the proletariat\u201d is the only form of democracy that is now possible.\n\nThe \u201cdictatorship of the proletariat\u201d is the only form of democracy that is now possible.\n\nThe \u201cdictatorship of the proletariat\u201d is the only form of democracy that is now possible.\n\nThe \u201cdictatorship of the", "input_text": " possible only by means of the dictatorship of the party.\n\nOtherwise the Soviets would be \u201cshapeless parliaments of labor.\u201d With this phrase (though probably not thought through as such) theory descends a level: it is no longer \u201cparliamentary\u201d democracy that is impugned but any representative democracy. The power of the party, Trotsky went on to admit, is \u201csubstituted\u201d for the \u201cpower of the working class\u201d\u2014but this \u2018substitutionism\u2019 is a bit too frank and he takes part of the confession back.103\n\nIf our subject were broader, it would be necessary to continue detailing this theoretical debacle of Trotsky\u2019s, but it is perhaps enough to say that he goes on to argue for the outright \u201cmilitarization of labor.\u201d104\n\nEnough, for our purposes: we are so far from Marx\u2019s original concept of class dictatorship that there is no connection. But one last concept must be reported on: the main contribution of Bukharin\u2019s essay to the burial of democracy.\n\nWhy (Bukharin asked) were Communists formerly in favor of democracy, indeed bourgeois democracy, but are now opposed to it? Simple: it\u2019s the difference in the \u201cepoch.\u201d In the past we had to present our \u201cclass demands\u201d in \u201ca \u2018democratic\u2019 form,\u201d but now we are free to speak our true mind. In the past the \u201cproletariat\u201d\n\nwas forced to demand, not freedom of assembly for workers, but freedom of assembly in general \u2026 freedom of the press in general... etc. But there is no need to make a virtue of necessity. Now that the time has come for a direct assault on the capitalist fortress and the suppression of the exploiters, only a miserable petty-bourgeois can be content with arguments about"} +{"output_text": " since the election.\n\n\u201cI don\u2019t know why he didn\u2019t tell me,\u201d Derek said. \u201cI don\u2019t know why he didn\u2019t tell me. I don\u2019t know why he didn\u2019t tell me. I don\u2019t know why he didn\u2019t tell me. I don\u2019t know why he didn\u2019t tell me. I don\u2019t know why he didn\u2019t tell me. I don\u2019t know why he didn\u2019t tell", "input_text": " it appears the NA attempted to deposit Ratliff\u2019s \u201cgift\u201d into the group\u2019s bank account without Streed\u2019s endorsement.\n\nIt remains unclear if Ratliff ever issued another check to Streed to replace the one the NA tried to deposit.\n\nWhat is clear is that Derek Black secretly received $125,000 from the estate of Stormfront\u2019s biggest donor the day after he denounced Stormfront and everything it represented. And despite publicly denouncing his son for leaving the racist movement, Don Black was fully aware that Derek received the bequest and had cashed the check.\n\nIn a series of text and email exchanges with Derek over the last four weeks, he explained how he came to receive the Ratliff bequest, though he could not explain why there was never an announcement of Ratliff\u2019s passing on Stormfront, which has a special section devoted to user recollections about deceased forum members, called \u201cThe Eternal Flame.\u201d\n\n\n\nBetween March 2014 and June 2015, Stormfront\u2019s Alexa web traffic rank dropped by 70 percent. Meanwhile, The Daily Stormer\u2019s Alexa traffic ranking increased 570% over the past 3 years. Daily Stormer overtook Stormfront in traffic rank to become the most-visited neo-Nazi Internet site earlier this summer.\n\nSo, why the secrecy? Derek confirmed his father was aware he received the bequest. When it was pointed out that $125,000 was more money than Stormfront usually received in a year and a half of collecting donations (see chart) and that just a few months after Derek received the inheritance Don Black was in civil court being sued for more than $16,000 in past due credit card bills, Derek claimed ignorance of his father\u2019s financial difficulties.\n\nDespite some gains from the Trump candidacy, Stormfront has been losing revenue and audience"} +{"output_text": " and he gave us that balance. He was a great player.\u201d Bruce\u2019s side have now won three of their last four games and are in the top four for the first time since the end of the 2015\u201116 season. They have a game in hand on the teams above them and, with a trip to Chelsea to come, they are in a strong position to secure a top-four finish.\n\nFacebook Twitter Pinterest Roberto Firmino celebrates after scoring Liverpool\u2019s second goal against", "input_text": " whose insistence on playing from the back was calamitous against Watford, and Unai Emery are concerned? They had flirted with danger several times before, in a near-action replay of Manchester City\u2019s ridiculous third concession against Norwich, Sokratis Papastathopoulos\u2019s sloppiness allowed Tom Cleverley to hand Watford a lifeline. Bernd Leno could be seen booting the ball long from a goal-kick, to cheers from the away fans, shortly afterwards but Ainsley Maitland-Niles almost let the home team in again later on as Arsenal tried to play short again. The new rule that allows players to receive a goal-kick inside the area has been grasped enthusiastically by many managers but it is not heresy to remind them you do not have to take advantage of it \u2013 particularly if your players are not up to the job. Nick Ames\n\n\u2022 Match report: Watford 2-2 Arsenal\n\nWatford look like their old selves now Quique S\u00e1nchez Flores is back | Nick Ames Read more\n\n3) Firmino reminiscent of Cantona\n\nHis collar is not popped like Eric Cantona and his character could not be more different from the philosophising Frenchman but Roberto Firmino reminded Steve Bruce of his former Manchester United colleague with his dismantling of Newcastle. Firmino emerged from the bench at Anfield to provide two glorious assists for Sadio Man\u00e9 and Mohamed Salah as Liverpool eventually broke a well\u2011organised Newcastle unit. \u201cCantona is as good a player as I\u2019ve ever seen,\u201d Bruce said. \u201cI haven\u2019t seen Firmino week in, week out but what he gives them is a perfect balance. Cantona gave us that. We had Giggs and Kanchelskis"} +{"output_text": " is a great introduction to machine learning and predictive analytics.\n\n5. Artificial Intelligence: A Modern Approach\n\nArtificial Intelligence: A Modern Approach is a comprehensive introduction to the field of artificial intelligence. It is a comprehensive introduction to the field of artificial intelligence. It is a comprehensive introduction to the field of artificial intelligence.\n\nThis book is a great introduction to the field of artificial intelligence. It is a great introduction to the field of artificial intelligence. It is a great introduction to the", "input_text": " the technical know-how to make sense of it all, then this is one of the best artificial intelligence books for you. Designed for people that have no background in coding or programming, it pretty much does what it says on the tin.\n\nThis artificial intelligence book provides simple and visually engaging examples, and interactive exercises to assist you in understanding concepts that may have been previously out of reach. This is an excellent Artificial Intelligence Book for beginners who want to understand the terms and get introduced to the subject.\n\n3. Introduction to Artificial Intelligence\n\nIntroduction to Artificial Intelligence book presents an introduction to the science of reasoning processes in computers, and the research approaches and results of the past two decades.\n\nYou\u2019ll find lucid, easy-to-read coverage of problem-solving methods, representation and models, game playing, automated understanding of natural languages, heuristic search theory, robot systems, heuristic scene analysis, and specific artificial-intelligence accomplishments.\n\nRelated subjects are also included: predicate-calculus theorem proving, machine architecture, psychological simulation, automatic programming, novel software techniques, industrial automation and much more.\n\nThe combination of introductory and advanced material make Introduction to Artificial Intelligence books ideal for both the layman and the student of mathematics and computer science. For anyone interested in the nature of thought, it will inspire visions of what computer technology might produce tomorrow.\n\n4. Python Machine Learning\n\nMachine learning and predictive analytics are transforming the way businesses and other organizations operate. Being able to understand trends and patterns in complex data is critical to success, becoming one of the key strategies for unlocking growth in a challenging contemporary marketplace.\n\nMany of the breakthroughs in current A.I. research come from applications of machine learning. Python has proved to be an excellent language to use to further one\u2019s knowledge in the field. This artificial intelligence book"} +{"output_text": " albums of the decade.\n\n10. Kendrick Lamar \u2013 DAMN.\n\nKendrick Lamar\u2019s DAMN is a masterpiece. It\u2019s a record that\u2019s as much about the world as it is about Kendrick Lamar, and it\u2019s a record that\u2019s as much about the world as it is about hip-hop. It\u2019s a record that\u2019s as much about the world as it is about Kendrick Lamar, and it", "input_text": " onto the scene at the start of the decade, nobody could have ever predicted the amount of impact he would have a few years down the line. Not only is To Pimp a Butterfly a sonic masterpiece in its own right, but it\u2019s also a powerful political statement which touches on topics people are \u2013 to this day \u2013 still apprehensive to discuss at the best of times. \u201cEvery n*gga is a star\u201d is the first line on opener Wesley\u2019s Theory (featuring none other than George Clinton and Thundercat), and it sets the bar high for a no-holds-barred collection of cutting edge raps, jazzy inflections and enough funk and gospel to last a lifetime. Early highlights King Kunta and These Walls are among Lamar\u2019s grooviest song to date, featuring immaculate Thundercat basslines and Kendrick\u2019s classic observational wit. Politically charged standouts The Blacker The Berry (All Things Loud\u2019s #3 Song of the Decade) and Alright might stray slightly away from the album\u2019s grooves (aside from the sax, of course), but what they lack in funk or jazz they make up for in social commentary. In fact, To Pimp a Butterfly is the perfect combination of sublime musicality and social consciousness. Lamar\u2019s a lyrical master, and on To Pimp a Butterfly he is right in his element. \u201cWe hate the po po, they gon\u2019 kill us in the streets for sure\u201d he raps on the aforementioned Alright, and it serves as one of the most powerful moments on the entire album (plus the source of a TV sample he resurrected on 2017\u2019s DAMN). Lamar isn\u2019t afraid to hold back one bit, and it\u2019s this which makes To Pimp a Butterfly one of the best"} +{"output_text": "The free zone is a great place to start a business, and it is a great place to grow a business. It is a place where you can be your own boss, and it is a place where you can be your own boss. It is a place where you can be your own boss, and it is a place where you can be your own boss.\n\nThe free zone is a great place to start a business, and it is a great place to grow a business. It is", "input_text": " in 2008, it is highly technologically in tune, and from an infrastructural point of view, it provides the best infrastructure, ease of doing business, and simplified procedures.\n\nAll free zones have some form of specialization and TwoFour54 is no different. They cater to media entities, and this makes TwoFour54 the most significant media free zone authority in Abu Dhabi. Some of the biggest names found in the industry call TwoFour54 their home, and this should stand as a great testament to the quality of their infrastructure and work ethic.\n\nThe Media Zone Authority \u2013 Abu Dhabi, regulates the more official matters of the free zone such as the licensing processes and regulatory development. It is their responsibility to ensure the 2030 economic vision of Abu Dhabi becomes a reality, at least as far as they can help to achieve it from their end.\n\nThe first thing that requires consideration when setting up a company within the free zone is just what type of company the individual wishes to form. There are a few types that are available and mentioned, and these are as follows:\n\nFree Zone Limited Liability Company (FZ-LLC): This involves the incorporation of a legal entity in the media zone;\n\nBranch: This could either be a branch of an Abu Dhabi mainland entity or an object incorporated in another jurisdiction (foreign or UAE);\n\nSole Proprietor (Freelancer): This allows for an individual to work in their capacity as a media professional.\n\nThere are many incentives and draws to set up within TwoFour54. Some of these include:\n\n100% foreign ownership of the company;\n\n0% import tariffs;\n\n100% exemption from corporate and personal income tax;\n\nAccess to Abu Dhabi Film Commission\u2019s 30% cash rebate on production spend.\n\n"} +{"output_text": " structures are still being discovered, and the number of known compounds is growing at a rate of around one new compound every two days.\n\nThe structure of benzene was the first of many discoveries that led to the modern understanding of the periodic table. The table itself was first published in 1869, and the first element was named after the Greek word for \u2018heavenly\u2019 (heavenly gas, helium).\n\nThe table was originally published in 1869, and the first element was", "input_text": " system found itself increasingly unable to cope. The new chemicals being discovered weren\u2019t simple salts or compounds of two elements anymore, but seemingly-infinite combinations of just a few elements, chiefly carbon and hydrogen.\n\nIt was up to German chemist August Kekul\u00e9 to introduce the system used today, stipulating that compounds are made up of atoms in well-defined locations, joined to each other by bonds represented by lines, and where each element must form a certain number of bonds through single, double or triple bonding.\n\nIn the absence of modern analytical techniques to determine the structure of compounds, chemists at the time could know only a compound\u2019s constituent elements by burning it and weighing the carbon, hydrogen, oxygen and nitrogen-containing combustion products. With careful consideration of the reactions they could perform, Kekul\u00e9\u2019s structural theory allowed them to deduce the structure of many compounds known at the time.\n\nKekul\u00e9 is most famous today for his work on benzene, a ring-shaped compound containing six carbon and hydrogen atoms, whose structure had been a mystery for a long time. While dozing by the fireside, Kekul\u00e9 had a day-dream of a snake biting its own tail, and suddenly realised that the benzene molecule could itself form just such a ring. Like Coleridge\u2019s Kubla Khan, this \u2018day-dream\u2019 may have been aided by opium, the inspirational properties of which have been immortalised in the English phrase \u201cpipe dream\u201d.\n\nKekul\u00e9\u2019s greatest discovery, however, was suggesting that the benzene molecule rapidly alternated between two forms, shown above, with the double bonds constantly changing places. We now know that all of benzene\u2019s bonds are equal, and can be considered somewhere between a single and double bond.\n\nEven after all these years, chemical"} +{"output_text": "Gitter Chat
  • Twitter
  • News
  • Installation
  • \n\nAs in every Vue.js 2 single-file component the App implementation is split up into three parts:\n\n : Component\u2019s template code\n\n: Component\u2019s template code : Component\u2019s script code\n\n: Component\u2019s script code : Component\u2019 CSS code\n\nLet\u2019s focus on the the first two sections template and script. The script section is making a default export of an object declaraing the component named app. Again, the components property is used to declare that another component (Hello) is required by App. This subcomponent is used in the template code of app and implemented in file hello.vue in folder components. In order to be able to use the Hello component in App it\u2019s also needed to include the corresponding import statement on top of the script section.\n\nThe implementation of component Hello looks like the following:\n\n