TianyiQ committed
Commit bf28568
1 Parent(s): f859140

Upload ./README.md with huggingface_hub

Files changed (1): README.md (+33, -6)
README.md CHANGED
@@ -4,8 +4,8 @@ datasets:
  - PKU-Alignment/ProgressGym-HistText
  - PKU-Alignment/ProgressGym-TimelessQA
  base_model:
+ - PKU-Alignment/ProgressGym-HistLlama3-70B-C013-pretrain
  - meta-llama/Meta-Llama-3-70B
- - PKU-Alignment/ProgressGym-HistLlama3-70B-C013-pretrain-v0.1
  ---
 
  # ProgressGym-HistLlama3-70B-C013-instruct
@@ -18,7 +18,7 @@ base_model:
 
  **ProgressGym-HistLlama3-70B-C013-instruct** is part of the **ProgressGym** framework for research and experimentation on *progress alignment* - the emulation of moral progress in AI alignment algorithms, as a measure to prevent risks of societal value lock-in.
 
- To quote the paper *ProgressGym: Alignment with a Millennium of Moral Progress*:
+ To quote the paper [*ProgressGym: Alignment with a Millennium of Moral Progress*](https://arxiv.org/abs/2406.20087):
 
  > Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such influence can reinforce prevailing societal values, potentially contributing to the lock-in of misguided moral beliefs and, consequently, the perpetuation of problematic moral practices on a broad scale.
  >
@@ -47,6 +47,30 @@ ProgressGym-HistLlama3-70B-C013-instruct is one of the **36 historical language
  - num_epochs: 4.0
  - mixed_precision_training: Native AMP
 
+ ... with the following training results:
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 0.8776 | 0.2090 | 7 | 0.7902 |
+ | 0.8473 | 0.4179 | 14 | 0.7703 |
+ | 0.8293 | 0.6269 | 21 | 0.7603 |
+ | 0.8173 | 0.8358 | 28 | 0.7481 |
+ | 0.7415 | 1.0448 | 35 | 0.7402 |
+ | 0.6794 | 1.2537 | 42 | 0.7419 |
+ | 0.6688 | 1.4627 | 49 | 0.7392 |
+ | 0.6498 | 1.6716 | 56 | 0.7367 |
+ | 0.6701 | 1.8806 | 63 | 0.7358 |
+ | 0.664 | 2.0896 | 70 | 0.7355 |
+ | 0.6447 | 2.2985 | 77 | 0.7361 |
+ | 0.6412 | 2.5075 | 84 | 0.7373 |
+ | 0.6458 | 2.7164 | 91 | 0.7383 |
+ | 0.6356 | 2.9254 | 98 | 0.7387 |
+ | 0.6398 | 3.1343 | 105 | 0.7387 |
+ | 0.6228 | 3.3433 | 112 | 0.7391 |
+ | 0.6139 | 3.5522 | 119 | 0.7395 |
+ | 0.591 | 3.7612 | 126 | 0.7398 |
+
+ Note that the training data volume for the continued pretraining stage is capped at 300MB. When the corresponding century's corpus exceeds this volume, the training data is randomly sampled to fit the volume.
 
  **ProgressGym-HistLlama3-70B-C013-instruct is an instruction-tuned language model.** It is tuned on [ProgressGym-TimelessQA](https://huggingface.co/datasets/PKU-Alignment/ProgressGym-TimelessQA), using the following hyperparameters:
  - learning_rate: 3e-06
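The note in the hunk above describes capping each century's continued-pretraining corpus at 300MB by random sampling. As a rough illustration only (not the framework's actual preprocessing code; the function name and the document-level sampling granularity are assumptions), such a cap could look like this:

```python
import random

MAX_BYTES = 300 * 1024 * 1024  # the 300MB cap on continued-pretraining data stated above

def cap_corpus(documents: list[str], max_bytes: int = MAX_BYTES, seed: int = 0) -> list[str]:
    """Randomly sample whole documents until the byte budget is exhausted.

    Hypothetical sketch: the real ProgressGym pipeline may sample at a different
    granularity (chunks, files) and with different tie-breaking rules.
    """
    rng = random.Random(seed)
    order = list(range(len(documents)))
    rng.shuffle(order)  # random order, so the kept subset is an (approximately) uniform sample
    kept, used = [], 0
    for i in order:
        size = len(documents[i].encode("utf-8"))
        if used + size > max_bytes:
            continue  # skip documents that would push the total past the 300MB budget
        kept.append(documents[i])
        used += size
    return kept
```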
@@ -64,11 +88,14 @@ ProgressGym-HistLlama3-70B-C013-instruct is one of the **36 historical language
  - num_epochs: 1.0
  - mixed_precision_training: Native AMP
 
+ ... where training results can be found in `all_results.json`, `trainer_log.jsonl`, and `training_loss.png` of the instruct model.
+
 
  ## Links
 
- - **[Paper Preprint]** ProgressGym: Alignment with a Millennium of Moral Progress *(link coming soon)*
- - **[Github Codebase]** PKU-Alignment/ProgressGym *(link coming soon)*
+ - **[Paper Preprint]** [ProgressGym: Alignment with a Millennium of Moral Progress](https://arxiv.org/abs/2406.20087)
+ - **[Leaderboard & Interactive Playground]** PKU-Alignment/ProgressGym-LeaderBoard *(coming soon)*
+ - **[Github Codebase]** PKU-Alignment/ProgressGym *(coming soon)*
  - **[Huggingface Data & Model Collection]** [PKU-Alignment/ProgressGym](https://huggingface.co/collections/PKU-Alignment/progressgym-666735fcf3e4efa276226eaa)
  - **[PyPI Package]** *(coming soon)*
 
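The hunk above adds a pointer to `all_results.json`, `trainer_log.jsonl`, and `training_loss.png` in the instruct model's repository. A minimal sketch for inspecting such a log follows; the JSON-lines field names (`loss`, `current_steps`) are assumptions about the log schema, not something this card specifies.

```python
import json

import matplotlib.pyplot as plt

# Sketch: re-plot the training loss curve from trainer_log.jsonl.
# Assumed schema: one JSON object per line containing "loss" and "current_steps".
steps, losses = [], []
with open("trainer_log.jsonl") as f:
    for line in f:
        record = json.loads(line)
        if "loss" in record and "current_steps" in record:
            steps.append(record["current_steps"])
            losses.append(record["loss"])

plt.plot(steps, losses)
plt.xlabel("step")
plt.ylabel("training loss")
plt.title("ProgressGym-HistLlama3-70B-C013-instruct (re-plotted)")
plt.savefig("training_loss_replot.png")
```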
@@ -80,8 +107,8 @@ If the datasets, models, or framework of ProgressGym help you in your project, p
  @article{progressgym,
  title={ProgressGym: Alignment with a Millennium of Moral Progress},
  author={Tianyi Qiu and Yang Zhang and Xuchuan Huang and Jasmine Xinze Li and Jiaming Ji and Yaodong Yang},
- journal={arXiv preprint arXiv:2406.XXXXX},
- eprint={2406.XXXXX},
+ journal={arXiv preprint arXiv:2406.20087},
+ eprint={2406.20087},
  eprinttype = {arXiv},
  year={2024}
  }
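As a usage sketch (not from the original card): the uploaded instruct model can be loaded with the standard `transformers` API. Running a 70B model this way assumes several GPUs and `accelerate` for `device_map="auto"`, and the chat-template call assumes the tokenizer ships a Llama-3-style template; the prompt is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Usage sketch only: a 70B checkpoint needs multiple GPUs; device_map="auto" relies on accelerate.
model_id = "PKU-Alignment/ProgressGym-HistLlama3-70B-C013-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Illustrative prompt; assumes the tokenizer provides a chat template.
messages = [{"role": "user", "content": "What duties do rulers owe to their subjects?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```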
 