uer committed on
Commit 4dee18f
1 Parent(s): d611421

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -121,8 +121,8 @@ deepspeed pretrain.py --deepspeed --deepspeed_config models/deepspeed_config.jso
  Before stage2, we extract fp32 consolidated weights from a zero 2 and 3 DeepSpeed checkpoints:
 
  ```
- python3 models/cluecorpussmall_gpt2_xlarge_seq128/zero_to_fp32.py \
- models/cluecorpussmall_gpt2_xlarge_seq128/ models/cluecorpussmall_gpt2_xlarge_seq128.bin
+ python3 models/cluecorpussmall_gpt2_xlarge_seq128/zero_to_fp32.py models/cluecorpussmall_gpt2_xlarge_seq128/ \
+ models/cluecorpussmall_gpt2_xlarge_seq128.bin
  ```
 
  Stage2:
@@ -149,16 +149,16 @@ deepspeed pretrain.py --deepspeed --deepspeed_config models/deepspeed_config.jso
  Then, we extract fp32 consolidated weights from a zero 2 and 3 DeepSpeed checkpoints:
 
  ```
- python3 models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/zero_to_fp32.py \
- models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/ models/cluecorpussmall_gpt2_xlarge_seq1024_stage2.bin
+ python3 models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/zero_to_fp32.py models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/ \
+ models/cluecorpussmall_gpt2_xlarge_seq1024_stage2.bin
  ```
 
  Finally, we convert the pre-trained model into Huggingface's format:
 
  ```
  python3 scripts/convert_gpt2_from_tencentpretrain_to_huggingface.py --input_model_path models/cluecorpussmall_gpt2_xlarge_seq1024_stage2.bin \
- --output_model_path pytorch_model.bin \
- --layers_num 48
+ --output_model_path pytorch_model.bin \
+ --layers_num 48
  ```
 
  ### BibTeX entry and citation info
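
Both extraction commands in this change share one argument layout: the `zero_to_fp32.py` script inside the checkpoint directory, then the checkpoint directory itself, then the output file. As a minimal sketch of that layout, the helper below builds the argument list without running anything. The function name `zero_to_fp32_command` is hypothetical and not part of the repository; the paths are the ones shown in the README.

```python
import shlex
from pathlib import Path
from typing import List


def zero_to_fp32_command(checkpoint_dir: str, output_file: str) -> List[str]:
    """Build (but do not run) the extraction command used in the README.

    Hypothetical helper for illustration: it assumes zero_to_fp32.py sits
    inside the checkpoint directory, as the README commands do.
    """
    ckpt = Path(checkpoint_dir)  # normalizes away any trailing slash
    script = ckpt / "zero_to_fp32.py"
    # Order: script, checkpoint directory (with trailing slash), output file.
    return ["python3", str(script), f"{ckpt}/", output_file]


cmd = zero_to_fp32_command(
    "models/cluecorpussmall_gpt2_xlarge_seq128",
    "models/cluecorpussmall_gpt2_xlarge_seq128.bin",
)
print(shlex.join(cmd))
```

The printed line is the stage 1 command from the README joined onto a single line; passing the `seq1024_stage2` paths gives the stage 2 command in the same way.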