uer/gpt2-distil-chinese-cluecorpussmall

uer committed on Sep 4, 2023

Commit 4dee18f (1 parent: d611421)

Update README.md

Files changed (1)

README.md +6 -6

@@ -121,8 +121,8 @@ deepspeed pretrain.py --deepspeed --deepspeed_config models/deepspeed_config.jso
 Before stage2, we extract fp32 consolidated weights from a zero 2 and 3 DeepSpeed checkpoints:
 ```
-python3 models/cluecorpussmall_gpt2_xlarge_seq128/zero_to_fp32.py  \
-        models/cluecorpussmall_gpt2_xlarge_seq128/ models/cluecorpussmall_gpt2_xlarge_seq128.bin
+python3 models/cluecorpussmall_gpt2_xlarge_seq128/zero_to_fp32.py models/cluecorpussmall_gpt2_xlarge_seq128/ \
+                                                                  models/cluecorpussmall_gpt2_xlarge_seq128.bin
 ```
 Stage2:
@@ -149,16 +149,16 @@ deepspeed pretrain.py --deepspeed --deepspeed_config models/deepspeed_config.jso
 Then, we extract fp32 consolidated weights from a zero 2 and 3 DeepSpeed checkpoints:
 ```
-python3 models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/zero_to_fp32.py \
-        models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/ models/cluecorpussmall_gpt2_xlarge_seq1024_stage2.bin
+python3 models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/zero_to_fp32.py models/cluecorpussmall_gpt2_xlarge_seq1024_stage2/ \
+                                                                          models/cluecorpussmall_gpt2_xlarge_seq1024_stage2.bin
 ```
 Finally, we convert the pre-trained model into Huggingface's format:
 ```
 python3 scripts/convert_gpt2_from_tencentpretrain_to_huggingface.py --input_model_path models/cluecorpussmall_gpt2_xlarge_seq1024_stage2.bin \
-                                                        --output_model_path pytorch_model.bin \
-                                                        --layers_num 48
+                                                                    --output_model_path pytorch_model.bin \
+                                                                    --layers_num 48
 ```
 ### BibTeX entry and citation info
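
Review note: the six removed and six added lines appear to differ only in where the backslash continuation falls and how far the continuation lines are indented, so the commands themselves should be unchanged. The sketch below is one way to check that for the first `zero_to_fp32.py` invocation, assuming a POSIX shell. `show_args` is a hypothetical helper introduced here for illustration (it is not part of the README or of DeepSpeed), and it stands in for `python3` so that nothing is actually executed.

```shell
# Hypothetical helper: print each received argument on its own line.
show_args() { printf '%s\n' "$@"; }

# Old layout, copied from the removed lines (continuation indented 8 spaces).
old=$(show_args models/cluecorpussmall_gpt2_xlarge_seq128/zero_to_fp32.py  \
        models/cluecorpussmall_gpt2_xlarge_seq128/ models/cluecorpussmall_gpt2_xlarge_seq128.bin)

# New layout, copied from the added lines (output path aligned under the input directory).
new=$(show_args models/cluecorpussmall_gpt2_xlarge_seq128/zero_to_fp32.py models/cluecorpussmall_gpt2_xlarge_seq128/ \
                                                                  models/cluecorpussmall_gpt2_xlarge_seq128.bin)

# The shell drops each backslash-newline pair and collapses unquoted whitespace
# between words, so both layouts should yield the same three arguments.
if [ "$old" = "$new" ]; then
    echo "identical argument lists"
else
    echo "argument lists differ"
fi
```

The same comparison can be repeated for the stage-2 extraction command and the conversion command by pasting their old and new forms in place of the arguments above.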