	---
	language:
	- zh

	inference:
	  parameters:
	    max_new_tokens: 250
	    repetition_penalty: 1.1
	    top_p: 0.9
	    do_sample: True

	license: apache-2.0
	---
	# Wenzhong2.0-GPT2-3.5B model (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM).
	Unidirectional language models built on a decoder-only architecture, such as GPT, have strong generation ability. Wenzhong-GPT2-3.5B is a 3.5-billion-parameter model trained on 100 GB of general Chinese data for 28 hours on 32 A100 GPUs, and it is the largest open-source Chinese GPT2 model. It performs well on Chinese text continuation. Wenzhong2.0-GPT2-3.5B-Chinese is a Chinese GPT2 model that builds on Wenzhong-GPT2-3.5B and is trained with cleaner data.
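
The figures in the description above translate into a rough training budget and a lower bound on the memory needed just to hold the weights. The sketch below works through that arithmetic; the bytes-per-parameter values are the standard sizes of `float32` and `float16`, not something stated in this card, and the estimate ignores activations, optimizer state, and framework overhead.

```python
# Figures stated in the model description above.
NUM_PARAMETERS = 3.5e9   # 3.5 billion parameters
NUM_GPUS = 32            # A100 GPUs used for training
TRAINING_HOURS = 28      # wall-clock training time

# Bytes per parameter for common dtypes (standard sizes, not specific to this model).
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def gpu_hours(num_gpus, hours):
    """Total GPU-hours: number of GPUs times wall-clock hours."""
    return num_gpus * hours


def weight_size_gb(num_parameters, dtype):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_parameters * BYTES_PER_PARAM[dtype] / 1e9


print(gpu_hours(NUM_GPUS, TRAINING_HOURS))        # 896
print(weight_size_gb(NUM_PARAMETERS, "float32"))  # 14.0
print(weight_size_gb(NUM_PARAMETERS, "float16"))  # 7.0
```

In practice the memory needed to run the model will be higher than the weight size alone.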

	## Usage

	### Load model
	```python
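	from transformers import GPT2Tokenizer, GPT2Model

	# Download the tokenizer and weights from the Hugging Face Hub.
	tokenizer = GPT2Tokenizer.from_pretrained('IDEA-CCNL/Wenzhong2.0-GPT2-3.5B-chinese')
	# GPT2Model is the bare transformer without a language-modeling head.
	model = GPT2Model.from_pretrained('IDEA-CCNL/Wenzhong2.0-GPT2-3.5B-chinese')

	text = "Replace me by any text you'd like."
	encoded_input = tokenizer(text, return_tensors='pt')
	# output.last_hidden_state has shape (batch, sequence_length, hidden_size).
	output = model(**encoded_input)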
	```
	### Generation
	```python
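	from transformers import pipeline, set_seed

	set_seed(55)  # fix the random seed so sampling is reproducible
	generator = pipeline('text-generation', model='IDEA-CCNL/Wenzhong2.0-GPT2-3.5B-chinese')
	# Continue the prompt "北京位于" ("Beijing is located in"); max_length counts prompt plus generated tokens.
	generator("北京位于", max_length=30, num_return_sequences=1)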
	```
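
The card's metadata lists sampling settings for hosted inference (`max_new_tokens`, `repetition_penalty`, `top_p`, `do_sample`). As a minimal sketch, the same settings could be passed to `generate` when running the model locally. The `generate` helper and the `GENERATION_KWARGS` and `MODEL_ID` names below are hypothetical and not part of the original card. The sketch also assumes that `GPT2LMHeadModel` is an appropriate class for this checkpoint.

```python
# Sampling settings copied from the `inference.parameters` metadata of this card.
GENERATION_KWARGS = {
    "max_new_tokens": 250,
    "repetition_penalty": 1.1,
    "top_p": 0.9,
    "do_sample": True,
}

MODEL_ID = "IDEA-CCNL/Wenzhong2.0-GPT2-3.5B-chinese"


def generate(prompt, **overrides):
    """Continue `prompt` with the card's sampling settings (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without loading the model.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(MODEL_ID)
    # GPT-2 with a language-modeling head, which text generation requires.
    model = GPT2LMHeadModel.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    # Caller-supplied overrides take precedence over the card's defaults.
    kwargs = {**GENERATION_KWARGS, **overrides}
    output_ids = model.generate(**inputs, **kwargs)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the full 3.5B-parameter checkpoint on first use):
# print(generate("北京位于", max_new_tokens=30))
```

Keeping the settings in a plain dictionary makes it easy to override a single value per call, for example a shorter `max_new_tokens` for quick tests.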

	## Citation
	If you find this resource useful, please cite the following website in your paper.
	```
	@misc{Fengshenbang-LM,
	title={Fengshenbang-LM},
	author={IDEA-CCNL},
	year={2021},
	howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
	}
	```