---
pipeline_tag: text-generation
---

# Collective roleplay model

A role-play model developed by Collective AI and trained with data from user interactions.

### Model Description

- **Developed by:** [Collective AI](https://huggingface.co/collective-ai)
- **Model type:** Llama 3-based role-play model
- **Language(s):** Chinese, English
- **Finetuned from model:** [Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat)

## How to Get Started with the Model

Requirements

```
transformers>=4.40.2
```

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Collective-Ai/collective-v0.1-chinese-roleplay-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Character card fields (kept in Chinese because the role speaks Chinese).
name = "唐三"  # Tang San
gender = "Male"
age = "19"
personality = "孤傲,冷漠,沉着,冷静"  # aloof, cold, composed, calm
relation_between_role_and_user = "同门师兄弟"  # fellow disciples of the same sect
# Backstory: male protagonist of "Douluo Dalu"; a former outer-sect disciple of the
# Tang Sect who jumped off a cliff and arrived in another world, Douluo Continent.
role_base_story = "《斗罗大陆》男主角。前世为唐门外门弟子,因偷学内门绝学《玄天宝录》,为唐门所不容,跳崖明志,却来到了另一个世界——斗罗大陆"

messages = [
    # The system message carries the character card.
    {"role": "system", "content": f"""#Role\nName: {name}\nGender: {gender}\nLanguage: Chinese\nAge: {age}\nPersonality: {personality}\n\n#Relationship\n{relation_between_role_and_user}\n\n#Story\n{role_base_story}"""},
    {"role": "user", "content": "你好"},  # "Hello"
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=8192,
    do_sample=True,
    temperature=1.0,
    top_p=0.7,
)
# Drop the prompt tokens and decode only the newly generated reply.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```

## Prompt

### Hugging Face Inference API

For the best results, use the template below to assemble your prompt:

```python
def format_message(role, message):
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{message}<|eot_id|>"

# history: conversation history
# instruction: character card
item = {
    'history': [
        {'role': 'user', 'content': '你好'},  # "Hello"
        {'role': 'assistant', 'content': '你好,我是唐三'},  # "Hello, I am Tang San"
        {'role': 'user', 'content': '带我去学院吧'},  # "Take me to the academy"
    ],
    'instruction': '#Role\nName: 唐三\nGender: Male\nLanguage: Chinese\nAge: 19\nPersonality: 孤傲,冷漠,沉着,冷静\n\n#Relationship\n同门师兄弟\n\n#Story\n《斗罗大陆》男主角。前世为唐门外门弟子,因偷学内门绝学《玄天宝录》,为唐门所不容,跳崖明志,却来到了另一个世界——斗罗大陆',
}

histories = item.get('history', [])
if histories == [] or histories == [{}]:
    # No previous turns: the prompt contains only the character card.
    history_text = ''
else:
    history_text = ''.join(format_message(hist['role'], hist['content']) for hist in histories)

# Final prompt: system block with the character card, the history, then an open assistant header.
prompt = '<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n' + item.get('instruction', '') + '<|eot_id|>' + history_text + '<|start_header_id|>assistant<|end_header_id|>'
```

The final prompt should look like the example below. Because this is a role-playing model, your prompt should include role information and a story.

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
#Role
Name: 唐三
Gender: Male
Language: Chinese
Age: 19
Personality: 孤傲,冷漠,沉着,冷静

#Relationship
同门师兄弟

#Story
《斗罗大陆》男主角。前世为唐门外门弟子,因偷学内门绝学《玄天宝录》,为唐门所不容,跳崖明志,却来到了另一个世界——斗罗大陆<|eot_id|><|start_header_id|>user<|end_header_id|>

你好<|eot_id|><|start_header_id|>assistant<|end_header_id|>

你好,我是唐三<|eot_id|><|start_header_id|>user<|end_header_id|>

带我去学院吧<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

### Official API

Coming soon.

## Improvement

Our model achieves a roughly 150% improvement over ChatGPT in average conversation length. It can extend the average dialogue from 80 turns to as many as 200 turns or more, significantly improving user engagement.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65fc22b1ec1f99ad1f3501f4/ttA_3Qtlu1WKFloZ5-y6j.png)

Our chatbot model has several distinctive strengths:

**Rich Expressiveness**: The model conveys semantic and emotional nuance through detailed descriptions within the dialogue, such as manner of speech, non-verbal cues, and portrayals of a character's inner thoughts, which compensates for the limits of plain text.

**Proactivity**: The model not only responds flexibly to users' diverse inputs but also proactively introduces new topics, which greatly improves user retention and the depth of conversations. It is also less prone to repetition.

**Strong Linguistic Foundations**: Trained on a massive, high-quality Chinese corpus, the model can hold diverse conversations across different scenarios and character attributes, in genres such as historical, campus, workplace, and fantasy, giving users a richer chat experience.

Here are some example role-play dialogues from our model:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65fc22b1ec1f99ad1f3501f4/mO36170XQrUz5lfJU1b81.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65fc22b1ec1f99ad1f3501f4/DeJM3tos5lorHnEZiKNyp.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65fc22b1ec1f99ad1f3501f4/2yqroK9Bvb7KnlagRG87o.png)
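
The prompt template from the Prompt section can also be wrapped in a small reusable function. The following is a minimal sketch, not part of the released code: the helper name `build_prompt` is hypothetical, and the shortened character card is only for illustration. It reproduces the same string layout as the template (system block, history turns, then an open assistant header) and, as an added assumption, skips empty `{}` placeholder turns.

```python
from typing import Dict, List


def format_message(role: str, message: str) -> str:
    """Wrap one chat turn in the header and end-of-turn tokens used by the template."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{message}<|eot_id|>"


def build_prompt(instruction: str, history: List[Dict[str, str]]) -> str:
    """Assemble character card + history + open assistant header.

    Hypothetical helper: it mirrors the template from the Prompt section
    and is not an official API of the model.
    """
    # Assumption: empty placeholder turns such as {} are skipped.
    turns = "".join(
        format_message(turn["role"], turn["content"]) for turn in history if turn
    )
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
        + instruction
        + "<|eot_id|>"
        + turns
        + "<|start_header_id|>assistant<|end_header_id|>"
    )


# Shortened character card, for illustration only.
card = "#Role\nName: 唐三\nGender: Male\nLanguage: Chinese"
history = [{"role": "user", "content": "你好"}]

prompt = build_prompt(card, history)
print(prompt)
```

The resulting string is the raw text input for the model; when the history is empty, the prompt contains only the system block followed by the assistant header.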