How to stop Llama from generating follow-up questions and excessive conversation with langchain
Why does Llama generate random follow-up questions and answer them by itself? It's annoying and overcomplicates the output. I'm using the LangChain framework, and my prompt template code looks something like this:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use a three-sentence maximum and keep the answer concise. "
    "Context: {context}"
)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)
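One mitigation worth trying before anything else (my own sketch, not something from the original template) is to make the no-follow-ups rule explicit in the system prompt, since the original only constrains length:

```python
# Sketch: a hardened system prompt. The "do not generate or answer any
# follow-up questions" sentence is my addition, not part of the original.
system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use a three-sentence maximum and keep the answer concise. "
    "Answer only the question asked; do not generate or answer "
    "any follow-up questions. "
    "Context: {context}"
)
```

This doesn't fix the underlying stop-token behavior, but in practice instruction-tuned models often follow an explicit prohibition better than an implicit one.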
def get_response_llm_pinecone(llm, vectorstore, query):
    # Create the question-answer chain
    question_answer_chain = create_stuff_documents_chain(llm, prompt)
    # Create the retrieval chain
    chain = create_retrieval_chain(vectorstore.as_retriever(), question_answer_chain)
    # Invoke the retrieval chain with the query
    response = chain.invoke({"input": query})
    print(response["input"])
    print(response["answer"])
This is a common problem with the langchain and llama_index wrapper libraries: the model keeps generating past the end of the answer. I would recommend working directly with the transformers library, as shown in the examples on this page: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
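For context: Llama 3's chat format ends each assistant turn with the special `<|eot_id|>` token, and the examples on the linked model card pass it to generation as an end-of-sequence token; if your stack doesn't stop on it, the model happily continues with self-dialogue, which is the symptom described here. If switching stacks isn't an option right away, a stopgap (my own sketch, not from the linked page) is to post-process the answer string and cut it at the first self-generated follow-up marker:

```python
def truncate_self_dialogue(answer, markers=("Question:", "Q:", "Human:", "User:")):
    """Cut the model's answer at the first self-generated follow-up marker.

    The marker list is a guess at what the model tends to emit; adjust it
    to whatever prefixes actually show up in your outputs.
    """
    cut = len(answer)
    for marker in markers:
        idx = answer.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return answer[:cut].rstrip()

# Example: the model answered, then started interviewing itself.
raw = "Paris is the capital of France.\n\nQuestion: What is the capital of Spain?"
print(truncate_self_dialogue(raw))  # -> Paris is the capital of France.
```

You would call this on `response["answer"]` before printing it. It's a band-aid rather than a fix, but it keeps the downstream output clean while you move to a setup that stops on `<|eot_id|>` properly.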