qixun/bert-chinese-poem


	---
	license: gpl-3.0

	widget:
	- text: "宵凉百念集孤[MASK]，暗雨鸣廊睡未能。生计坐怜秋一叶，归程冥想浪千层。寒心国事浑难料，堆眼官资信可憎。此去梦中应不忘，顺承门内近觚棱。"
	---
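	A BERT model for classical Chinese poetry. It was trained on the open-source corpus released by 搜韵 (Sou-Yun) for roughly 1.1 million steps with a batch_size of 16, and the training loss stabilized below 1.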

	Usage:

	```python
	from transformers import BertTokenizer, BertForMaskedLM
	import torch

	# Load the tokenizer
	tokenizer = BertTokenizer.from_pretrained("qixun/bert-chinese-poem")

	# Load the model
	model = BertForMaskedLM.from_pretrained("qixun/bert-chinese-poem")

	# Input text containing a single [MASK] token
	text = "宵凉百念集孤[MASK]，暗雨鸣廊睡未能。生计坐怜秋一叶，归程冥想浪千层。寒心国事浑难料，堆眼官资信可憎。此去梦中应不忘，顺承门内近觚棱。"

	# Tokenize
	inputs = tokenizer(text, return_tensors="pt")

	# Run inference without tracking gradients
	with torch.no_grad():
	    outputs = model(**inputs)

	# Locate the position of the [MASK] token
	mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]

	# Take the highest-scoring token id at that position
	# (.item() assumes the text contains exactly one [MASK])
	predicted_token_id = outputs.logits[0, mask_token_index].argmax(axis=-1).item()

	# Decode the predicted token id back to text
	predicted_token = tokenizer.decode([predicted_token_id])

	print(f"Predicted token: {predicted_token}")

	```
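
The snippet above returns only the single most likely character. When filling in a line of verse it can help to see several candidates with their probabilities. The following is a minimal sketch of that idea, not part of the original model card: the helper name `top_k_candidates`, its `id_to_token` parameter, and the four-token toy vocabulary are all hypothetical and exist only for illustration. It uses plain Python so it runs without downloading the model.

```python
import math

def top_k_candidates(logits, id_to_token, k=5):
    """Return the k highest-probability (token, probability) pairs.

    logits: list of floats, one score per vocabulary id, for ONE [MASK] position.
    id_to_token: callable mapping a vocabulary id (int) to its token string.
    Both names are hypothetical helpers for this sketch.
    """
    # Numerically stable softmax over the vocabulary scores.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank vocabulary ids by probability, highest first, and keep k of them.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id_to_token(i), probs[i]) for i in ranked]

# Toy demonstration with a made-up four-token vocabulary (not the real model's).
toy_vocab = ["灯", "月", "舟", "城"]
toy_logits = [2.0, 0.5, 1.0, -1.0]
candidates = top_k_candidates(toy_logits, lambda i: toy_vocab[i], k=2)
for token, prob in candidates:
    print(f"{token}: {prob:.3f}")
```

To apply this to the real model, one option is to pass `outputs.logits[0, mask_token_index[0]].tolist()` as `logits` and `tokenizer.convert_ids_to_tokens` as `id_to_token`. If you prefer to stay in PyTorch, `torch.topk` applied to the softmaxed logits gives the same ranking.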