Continuous Conversation Chat
Hi, I'm trying to use the chat model to get a GPT-like conversation on a Google Colab T4 GPU. My aim is to have each new prompt answered in the context of the conversation so far. Trendyol LLM's current answer to this example is:
"Türkiye'de kaç il var?"
Türkiyede 81 il bulunmaktadır.
"Bunlardan en büyük beşini yazar mısın?"
Tabi, işte en büyük 5 sayı: 1, 2, 3, 4,5
But it should produce:
İstanbul, Ankara, İzmir...
How can I get that result? I've tried these two approaches:
1- ChatHuggingFace (from langchain_community.chat_models.huggingface import ChatHuggingFace) and manual chat history appending
https://python.langchain.com/docs/integrations/chat/huggingface/
2- Manual chat history:
DEFAULT_SYSTEM_PROMPT = "Sen yardımcı bir asistansın ve sana verilen talimatlar doğrultusunda chat history'deki rollere göre, konuşma geçmişine bağlı kalarak en iyi cevabı üretmeye çalışacaksın.\n"
Initialize chat history
chat_history = []
Function to get user input
def get_user_input():
return input("Prompt: ")
Main loop for conversation
while True:
query = get_user_input()
if query.lower() in ['quit', 'q', 'exit']:
sys.exit()
result = generate_output(query)
print(result)
chat_history.append((("human", "{promt}"), ("assistant", "{result}")))
But my code couldn't keep the context with either approach.
Thanks in advance for your help :)
Have you tried the chat template from the model card?
pipe = pipeline("conversational",
model=model,
tokenizer=tokenizer,
device_map="auto",
max_new_tokens=1024,
repetition_penalty=1.1
)
messages = [
{"role": "user", "content": "Türkiye'de kaç il var?"}
]
outputs = pipe(messages, **sampling_params)
print(outputs)
You can append assistant messages to the messages list. You may also want to check the Mistral format, since that is its base model.
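To see what that Mistral-style format looks like for a given messages list, you can render the chat template without generating anything. A quick sketch, assuming the Trendyol tokenizer ships a chat template:

# Render the chat template to inspect the Mistral-style [INST] ... [/INST] prompt
prompt_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the prompt as a string instead of token ids
    add_generation_prompt=True,  # append the marker after which the model should answer
)
print(prompt_text)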
How can I take new user input with this code? @JosephH
My current code already contains that chat template:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import sys

# Model setup
model_id = "Trendyol/Trendyol-LLM-7b-chat-dpo-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map='auto',
                                             load_in_8bit=True)

# Text generation pipeline setup
sampling_params = dict(do_sample=True, temperature=0.3, top_k=50, top_p=0.9)
pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                device_map="auto",
                max_new_tokens=1024,
                return_full_text=True,
                repetition_penalty=1.1
                )

# Default system prompt
DEFAULT_SYSTEM_PROMPT = "Sen yardımcı bir asistansın ve sana verilen talimatlar doğrultusunda chat history'deki rollere göre, konuşma geçmişine bağlı kalarak en iyi cevabı üretmeye çalışacaksın.\n"

# Template for generating prompts
TEMPLATE = (
    "[INST] {system_prompt}\n\n"
    "{instruction} [/INST]"
)

# Function to generate prompts
def generate_prompt(instruction, system_prompt=DEFAULT_SYSTEM_PROMPT):
    return TEMPLATE.format_map({'instruction': instruction, 'system_prompt': system_prompt})

# Function to generate output
def generate_output(user_query, sys_prompt=DEFAULT_SYSTEM_PROMPT):
    prompt = generate_prompt(user_query, sys_prompt)
    outputs = pipe(prompt, **sampling_params)
    return outputs[0]["generated_text"].split("[/INST]")[-1]

# Initialize chat history
chat_history = []

# Function to get user input
def get_user_input():
    return input("Prompt: ")

# Main loop for conversation
while True:
    query = get_user_input()
    if query.lower() in ['quit', 'q', 'exit']:
        sys.exit()
    result = generate_output(query)
    print(result)
    chat_history.append((("human", "{promt}"), ("assistant", "{result}")))
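A likely reason this version loses context: generate_prompt only ever formats the current query, and the chat_history.append call stores the literal strings "{promt}" and "{result}" instead of the real values, so the history never reaches the model. An untested sketch of folding the stored history into the prompt, keeping the same TEMPLATE, could look like this (the conversational-pipeline approach further down avoids doing this by hand):

# Untested sketch: fold past turns into the instruction so the model can see them.
# Assumes chat_history holds ((role, text), (role, text)) pairs with the real values,
# e.g. chat_history.append((("human", query), ("assistant", result))).
def generate_prompt_with_history(instruction, history, system_prompt=DEFAULT_SYSTEM_PROMPT):
    turns = []
    for user_turn, assistant_turn in history:
        turns.append(f"{user_turn[0]}: {user_turn[1]}")
        turns.append(f"{assistant_turn[0]}: {assistant_turn[1]}")
    past = "\n".join(turns)
    combined = f"{past}\n{instruction}" if past else instruction
    return TEMPLATE.format_map({'instruction': combined, 'system_prompt': system_prompt})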
If possible, please send a Colab link; I don't want to set it up from scratch.
Sure, I've also allowed comments. Thanks a lot.
https://colab.research.google.com/drive/1MbHZhP3gXg03FQCBxlM7AVtrxf8Oy9Zz?usp=sharing
You should be appending messages like this:
# for user
messages.append({"role": "user", "content": user_input})
# for assistant
messages.append({"role": "assistant", "content": assistant_output})
Now I'm getting another error. With this code:
while True:
    user_input = get_user_input()
    if user_input.lower() in ['quit', 'q', 'exit']:
        sys.exit()
    assistant_output = pipe(messages, **sampling_params)
    print(assistant_output)
    # for user
    messages.append({"role": "user", "content": user_input})
    # for assistant
    messages.append({"role": "assistant", "content": assistant_output})
I got an error. You can inspect the Colab link again:
https://colab.research.google.com/drive/1hYkgGQcayHG6CKdBZxi0TryTZ1DxIwfq?usp=sharing
Here is the code. pipe automatically adds the assistant role.
https://colab.research.google.com/drive/1hYkgGQcayHG6CKdBZxi0TryTZ1DxIwfq?usp=sharing
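In other words, with the "conversational" pipeline you only need to append your own user messages: the pipeline wraps the same list you pass in and appends the generated assistant turn to it for you (at least with the transformers version used here), so no manual assistant append is needed. A small sketch of what that means:

# Sketch: after the call, the assistant reply has been appended to the same list
messages = [{"role": "user", "content": "Türkiye'de kaç il var?"}]
conversation = pipe(messages, **sampling_params)
print(messages[-1]["role"])     # "assistant" -- added by the pipeline itself
print(messages[-1]["content"])  # the model's reply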
That worked, thanks a lot, again :)
I'll add the full code so future readers can see the solution:
A Google Colab T4 GPU is used.
!pip install accelerate
!pip install -i https://pypi.org/simple/ bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import sys
# Model setup
model_id = "Trendyol/Trendyol-LLM-7b-chat-dpo-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map='auto',
                                             load_in_8bit=True)

from transformers import Conversation

# Function to get user input as a chat message
def get_user_input():
    return {"role": "user", "content": input("Prompt: ")}

# Conversational pipeline setup
sampling_params = dict(do_sample=True, temperature=0.3, top_k=50, top_p=0.9)
pipe = pipeline("conversational",
                model=model,
                tokenizer=tokenizer,
                device_map="auto",
                max_new_tokens=1024,
                repetition_penalty=1.1
                )

# Chat history; the pipeline appends the assistant replies to this list itself
messages = []

# Main loop for conversation
while True:
    user_input = get_user_input()
    messages.append(user_input)                            # for user
    assistant_output = pipe(messages, **sampling_params)   # assistant turn is appended automatically
    print(assistant_output)
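print(assistant_output) dumps the whole Conversation object every turn. If you only want the newest reply, plus the quit/exit check from the earlier loop, a small untested variation would be:

# Optional tweak (untested): quit handling plus printing only the latest assistant turn
while True:
    user_input = get_user_input()
    if user_input["content"].lower() in ['quit', 'q', 'exit']:
        sys.exit()
    messages.append(user_input)                           # for user
    assistant_output = pipe(messages, **sampling_params)  # appends the assistant turn itself
    print(assistant_output.messages[-1]["content"])       # just the newest reply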