RWKV/rwkv-5-world-1b5

KaleiNeely committed on Nov 27, 2023

Commit 47b17c7 (1 parent: de75931)

Update README.md

Files changed (1)

README.md +69 -19

@@ -4,54 +4,104 @@
 #### CPU
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True)
 tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True)
-text = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."
-prompt = f'Question: {text.strip()}\n\nAnswer:'
 inputs = tokenizer(prompt, return_tensors="pt")
-output = model.generate(inputs["input_ids"], max_new_tokens=256)
 print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
 ```
 output:
 ```shell
-Question: In a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese.
-Answer: The researchers were shocked to discover that the dragons in the valley were not only intelligent but also spoke perfect Chinese. This discovery has opened up new possibilities for cultural exchange and understanding between China and Tibet.
 ```
 #### GPU
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True).to(0)
 tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True)
-text = "请介绍北京的旅游景点"
-prompt = f'Question: {text.strip()}\n\nAnswer:'
 inputs = tokenizer(prompt, return_tensors="pt").to(0)
-output = model.generate(inputs["input_ids"], max_new_tokens=256, do_sample=True, temperature=1.0, top_p=0.1, top_k=0, )
 print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
 ```
 output:
 ```shell
-Question: 请介绍北京的旅游景点
-Answer: 北京是中国的首都，拥有许多著名的旅游景点。以下是其中一些：
-1. 故宫：位于北京市中心，是明清两代的皇宫，是中国最大的古代宫殿建筑群之一。
-2. 天安门广场：位于北京市中心，是中国最著名的广场之一，是中国人民政治协商会议的旧址。
-3. 颐和园：位于北京市西郊，是中国最著名的皇家园林之一，有许多美丽的湖泊和花园。
-4. 长城：位于北京市西北部，是中国最著名的古代防御工程之一，有许多壮观的景点。
-5. 北京大学：位于北京市东城区，是中国著名的高等教育机构之一，有许多知名的学者和教授。
-6. 北京奥林匹克公园：位于北京市
 ```

 #### CPU
 ```python
+import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
+def generate_prompt(instruction, input=""):
+    instruction = instruction.strip().replace('\r\n','\n').replace('\n\n','\n')
+    input = input.strip().replace('\r\n','\n').replace('\n\n','\n')
+    if input:
+        return f"""Instruction: {instruction}
+Input: {input}
+Response:"""
+    else:
+        return f"""User: hi
+Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
+User: {instruction}
+Assistant:"""
+model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True).to(torch.float32)
 tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True)
+text = "请介绍北京的旅游景点"
+prompt = generate_prompt(text)
 inputs = tokenizer(prompt, return_tensors="pt")
+output = model.generate(inputs["input_ids"], max_new_tokens=333, do_sample=True, temperature=1.0, top_p=0.3, top_k=0, )
 print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
 ```
 output:
 ```shell
+User: hi
+Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
+User: 请介绍北京的旅游景点
+Assistant: 北京是中国的首都，拥有众多的旅游景点，以下是其中一些著名的景点：
+1. 故宫：位于北京市中心，是明清两代的皇宫，内有大量的文物和艺术品。
+2. 天安门广场：是中国最著名的广场之一，是中国人民政治协商会议的旧址，也是中国人民政治协商会议的中心。
+3. 颐和园：是中国古代皇家园林之一，有着悠久的历史和丰富的文化内涵。
+4. 长城：是中国古代的一道长城，全长约万里，是中国最著名的旅游景点之一。
+5. 北京大学：是中国著名的高等教育机构之一，有着悠久的历史和丰富的文化内涵。
+6. 北京动物园：是中国最大的动物园之一，有着丰富的动物资源和丰富的文化内涵。
+7. 故宫博物院：是中国最著名的博物馆之一，收藏了大量的文物和艺术品，是中国最重要的文化遗产之一。
+8. 天坛：是中国古代皇家
 ```
 #### GPU
 ```python
+import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
+def generate_prompt(instruction, input=""):
+    instruction = instruction.strip().replace('\r\n','\n').replace('\n\n','\n')
+    input = input.strip().replace('\r\n','\n').replace('\n\n','\n')
+    if input:
+        return f"""Instruction: {instruction}
+Input: {input}
+Response:"""
+    else:
+        return f"""User: hi
+Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
+User: {instruction}
+Assistant:"""
+model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True, torch_dtype=torch.float16).to(0)
 tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-5-world-1b5", trust_remote_code=True)
+text = "乌兰察布"
+prompt = generate_prompt(text)
 inputs = tokenizer(prompt, return_tensors="pt").to(0)
+output = model.generate(inputs["input_ids"], max_new_tokens=128, do_sample=True, temperature=1.0, top_p=0.3, top_k=0, )
 print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
 ```
 output:
 ```shell
+User: hi
+Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
+User: 乌兰察布
+Assistant: 乌兰察布市是中国新疆维吾尔自治区的一个地级市，位于新疆维吾尔自治区西南部，毗邻青海省。乌兰察布市是新疆维吾尔自治区的重要城市之一，也是新疆维吾尔自治区的第二大城市。乌兰察布市是新疆的重要经济中心之一，拥有丰富的自然资源和人口密度，是新疆的重要交通枢纽和商
 ```
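
The main behavioral change in this commit is the `generate_prompt` helper, which replaces the old `Question: ... Answer:` f-string. The sketch below copies the helper from the diff and runs it without loading the model, to show which template each branch produces. The sample strings are illustrative assumptions, not taken from the README. Note that `.replace('\n\n','\n')` makes a single pass, so a run of three newlines is reduced to two rather than one.

```python
def generate_prompt(instruction, input=""):
    # Same normalization as in the diff: trim, convert CRLF to LF,
    # then replace each non-overlapping pair of newlines with one (single pass).
    instruction = instruction.strip().replace('\r\n', '\n').replace('\n\n', '\n')
    input = input.strip().replace('\r\n', '\n').replace('\n\n', '\n')
    if input:
        # Instruction-style template, used when extra input text is supplied.
        return f"""Instruction: {instruction}
Input: {input}
Response:"""
    else:
        # Chat-style template with a fixed opening exchange.
        return f"""User: hi
Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
User: {instruction}
Assistant:"""


# Illustrative inputs (assumptions for this sketch only).
chat_prompt = generate_prompt("  What is RWKV?\r\n")
instruct_prompt = generate_prompt("Translate to French", "Good morning")
triple = generate_prompt("line one\n\n\nline two")

print(chat_prompt)
print("---")
print(instruct_prompt)
print("---")
print(repr(triple))
```

Both README examples call `generate_prompt(text)` with no `input`, so they always take the chat-style branch; that is why the sample outputs start with the fixed `User: hi` exchange.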