TianyiQ committed
Commit bf28568
1 Parent(s): f859140

Upload ./README.md with huggingface_hub

Files changed (1): README.md (+33, -6)
README.md CHANGED
@@ -4,8 +4,8 @@ datasets:
  - PKU-Alignment/ProgressGym-HistText
  - PKU-Alignment/ProgressGym-TimelessQA
  base_model:
+ - PKU-Alignment/ProgressGym-HistLlama3-70B-C013-pretrain
  - meta-llama/Meta-Llama-3-70B
- - PKU-Alignment/ProgressGym-HistLlama3-70B-C013-pretrain-v0.1
  ---
 
  # ProgressGym-HistLlama3-70B-C013-instruct
@@ -18,7 +18,7 @@ base_model:
 
  **ProgressGym-HistLlama3-70B-C013-instruct** is part of the **ProgressGym** framework for research and experimentation on *progress alignment* - the emulation of moral progress in AI alignment algorithms, as a measure to prevent risks of societal value lock-in.
 
- To quote the paper *ProgressGym: Alignment with a Millennium of Moral Progress*:
+ To quote the paper [*ProgressGym: Alignment with a Millennium of Moral Progress*](https://arxiv.org/abs/2406.20087):
 
  > Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such influence can reinforce prevailing societal values, potentially contributing to the lock-in of misguided moral beliefs and, consequently, the perpetuation of problematic moral practices on a broad scale.
  >
@@ -47,6 +47,30 @@ ProgressGym-HistLlama3-70B-C013-instruct is one of the **36 historical language
  - num_epochs: 4.0
  - mixed_precision_training: Native AMP
 
+ ... with the following training results:
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 0.8776 | 0.2090 | 7 | 0.7902 |
+ | 0.8473 | 0.4179 | 14 | 0.7703 |
+ | 0.8293 | 0.6269 | 21 | 0.7603 |
+ | 0.8173 | 0.8358 | 28 | 0.7481 |
+ | 0.7415 | 1.0448 | 35 | 0.7402 |
+ | 0.6794 | 1.2537 | 42 | 0.7419 |
+ | 0.6688 | 1.4627 | 49 | 0.7392 |
+ | 0.6498 | 1.6716 | 56 | 0.7367 |
+ | 0.6701 | 1.8806 | 63 | 0.7358 |
+ | 0.664 | 2.0896 | 70 | 0.7355 |
+ | 0.6447 | 2.2985 | 77 | 0.7361 |
+ | 0.6412 | 2.5075 | 84 | 0.7373 |
+ | 0.6458 | 2.7164 | 91 | 0.7383 |
+ | 0.6356 | 2.9254 | 98 | 0.7387 |
+ | 0.6398 | 3.1343 | 105 | 0.7387 |
+ | 0.6228 | 3.3433 | 112 | 0.7391 |
+ | 0.6139 | 3.5522 | 119 | 0.7395 |
+ | 0.591 | 3.7612 | 126 | 0.7398 |
+
+ Note that the training data volume for the continued pretraining stage is capped at 300MB. When the corresponding century's corpus exceeds this volume, the training data is randomly sampled to fit the volume.
 
  **ProgressGym-HistLlama3-70B-C013-instruct is an instruction-tuned language model.** It is tuned on [ProgressGym-TimelessQA](https://huggingface.co/datasets/PKU-Alignment/ProgressGym-TimelessQA), using the following hyperparameters:
  - learning_rate: 3e-06
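The note in the hunk above describes capping each century's continued-pretraining corpus at 300MB by random sampling. As a rough illustration only (not the framework's actual preprocessing code; the function name and the document-level sampling granularity are assumptions), such a cap could look like this:

```python
import random

MAX_BYTES = 300 * 1024 * 1024  # the 300MB cap on continued-pretraining data stated above

def cap_corpus(documents: list[str], max_bytes: int = MAX_BYTES, seed: int = 0) -> list[str]:
    """Randomly sample whole documents until the byte budget is exhausted.

    Hypothetical sketch: the real ProgressGym pipeline may sample at a different
    granularity (chunks, files) and with different tie-breaking rules.
    """
    rng = random.Random(seed)
    order = list(range(len(documents)))
    rng.shuffle(order)  # random order, so the kept subset is an (approximately) uniform sample
    kept, used = [], 0
    for i in order:
        size = len(documents[i].encode("utf-8"))
        if used + size > max_bytes:
            continue  # skip documents that would push the total past the 300MB budget
        kept.append(documents[i])
        used += size
    return kept
```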
@@ -64,11 +88,14 @@ ProgressGym-HistLlama3-70B-C013-instruct is one of the **36 historical language
  - num_epochs: 1.0
  - mixed_precision_training: Native AMP
 
+ ... where training results can be found in `all_results.json`, `trainer_log.jsonl`, and `training_loss.png` of the instruct model.
+
 
  ## Links
 
- - **[Paper Preprint]** ProgressGym: Alignment with a Millennium of Moral Progress *(link coming soon)*
- - **[Github Codebase]** PKU-Alignment/ProgressGym *(link coming soon)*
+ - **[Paper Preprint]** [ProgressGym: Alignment with a Millennium of Moral Progress](https://arxiv.org/abs/2406.20087)
+ - **[Leaderboard & Interactive Playground]** PKU-Alignment/ProgressGym-LeaderBoard *(coming soon)*
+ - **[Github Codebase]** PKU-Alignment/ProgressGym *(coming soon)*
  - **[Huggingface Data & Model Collection]** [PKU-Alignment/ProgressGym](https://huggingface.co/collections/PKU-Alignment/progressgym-666735fcf3e4efa276226eaa)
  - **[PyPI Package]** *(coming soon)*
 
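The hunk above adds a pointer to `all_results.json`, `trainer_log.jsonl`, and `training_loss.png` in the instruct model's repository. A minimal sketch for inspecting such a log follows; the JSON-lines field names (`loss`, `current_steps`) are assumptions about the log schema, not something this card specifies.

```python
import json

import matplotlib.pyplot as plt

# Sketch: re-plot the training loss curve from trainer_log.jsonl.
# Assumed schema: one JSON object per line containing "loss" and "current_steps".
steps, losses = [], []
with open("trainer_log.jsonl") as f:
    for line in f:
        record = json.loads(line)
        if "loss" in record and "current_steps" in record:
            steps.append(record["current_steps"])
            losses.append(record["loss"])

plt.plot(steps, losses)
plt.xlabel("step")
plt.ylabel("training loss")
plt.title("ProgressGym-HistLlama3-70B-C013-instruct (re-plotted)")
plt.savefig("training_loss_replot.png")
```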
@@ -80,8 +107,8 @@ If the datasets, models, or framework of ProgressGym help you in your project, p
  @article{progressgym,
  title={ProgressGym: Alignment with a Millennium of Moral Progress},
  author={Tianyi Qiu and Yang Zhang and Xuchuan Huang and Jasmine Xinze Li and Jiaming Ji and Yaodong Yang},
- journal={arXiv preprint arXiv:2406.XXXXX},
- eprint={2406.XXXXX},
+ journal={arXiv preprint arXiv:2406.20087},
+ eprint={2406.20087},
  eprinttype = {arXiv},
  year={2024}
  }
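As a usage sketch (not from the original card): the uploaded instruct model can be loaded with the standard `transformers` API. Running a 70B model this way assumes several GPUs and `accelerate` for `device_map="auto"`, and the chat-template call assumes the tokenizer ships a Llama-3-style template; the prompt is illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Usage sketch only: a 70B checkpoint needs multiple GPUs; device_map="auto" relies on accelerate.
model_id = "PKU-Alignment/ProgressGym-HistLlama3-70B-C013-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Illustrative prompt; assumes the tokenizer provides a chat template.
messages = [{"role": "user", "content": "What duties do rulers owe to their subjects?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```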
 