Model giving out weird responses
I was trying to use this model for document question answering with LangChain. I store documents in a vector store; for each question, a single passage is retrieved from the vector store via similarity search, and then the passage, together with a prompt containing the question, is passed to the model to generate a response.
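For context, the retrieve-then-prompt flow I'm describing looks roughly like the sketch below. This is not LangChain's actual API; the `embed`, `cosine`, and `VectorStore` names are toy stand-ins (a bag-of-words "embedding" instead of a real sentence encoder) just to show the shape of the pipeline.

```python
# Toy sketch of the retrieval-then-prompt flow (not LangChain's real API).
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Bag-of-words stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self, passages):
        self.passages = [(p, embed(p)) for p in passages]

    def similarity_search(self, query: str, k: int = 1):
        # Rank stored passages by similarity to the query; return the top k.
        q = embed(query)
        scored = sorted(self.passages, key=lambda pe: cosine(q, pe[1]), reverse=True)
        return [p for p, _ in scored[:k]]

store = VectorStore([
    "Tenant shall have a one-time option to terminate the Lease.",
    "Landlord shall maintain the roof and exterior walls.",
])
question = "How does the tenant terminate the lease?"
passage = store.similarity_search(question, k=1)[0]
prompt = f"[INST] Read the extract and answer.\n\n{passage}\n\n{question} [/INST]"
```

The final `prompt` string is what gets sent to the model.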
Here is my prompt:
[INST]<>Read the following text extracts given to you. Answer question that follows using the information provided in the text extracts. If the required information is not provided in the text extracts say that you cannot answer the question. Keep in mind that the extracts may contain irrelevant details and extra whitespaces. Avoid these whitespaces and irrelevant details and answer precisely to the question in a concise manner.<> 51) TERMINATION OPTION. Tenant shall have a one-time option to terminate the Lease for the entire 43,750 SF Premises and any additional space that may be leased hereunder or pursuant to this Lease, effective at midnight (CST) February 28, 2018 (the “Termination Date”). In order for this one-time option to be effective, Tenant must give written notice of its intention to terminate the Lease to Landlord prior to September 1, 2017 and, together with such written notice, provide to Landlord a termination payment equal to the unamortized cost of Tenant Improvements and Commissions at eight percent (8%) amortization rate equaling the amount of $95,100.89 plus five (5) additional months of Basic Monthly and Additional Rental; such Additional Rental to be reasonably approximated if not known for certain. How does the process of terminating a commercial lease agreement work?[/INST]
The passage is from a publicly available lease document.
I am getting the following response:
According to the text extract provided, the process of terminating a commercial lease agreement through the one-time option provided in the lease is as follows:\n\n1. The tenant must give written notice of their intention to terminate the lease to the landlord prior to September 1, 2017.\n2. Along with the written notice, the tenant must provide a termination payment equal to the unamortized cost of tenant improvements and commissions at an eight percent (8%) amortization rate, which is calculated as follows: $95,100.89 + five (5) additional months of basic monthly and additional rental, with the additional rental to be reasonably approximated if not known for certain.\n\nIn conclusion, based on the provided text extract, the process of the process of which outlined in order to terminate a commercial lease tenants must provide written notice must provide the process of terminating a commercial lease the process of the process of the process of the process of the process of the process of the process of the process of terminating a commercial lease, the process of terminating a tenants must give written notice to terminate a
I tried different prompts and even tweaked the model parameters to see if the response would improve, but had no luck.
I've also tried the same prompt and passage on the 4-bit, 6-bit, and 8-bit versions, and on the 4-bit version of Llama 2 13B Chat GGML, but the responses were similar.
The same prompt and passage produced an accurate and precise response on the unquantized 7B model and also on the GPTQ version of this same model.
Does anybody know if this is an issue with the quantization format (GGML vs. GPTQ)?
Any help in fixing this issue is much appreciated.
I'm facing the same issue. It does generate a response, but it's not always relevant to the input query.
Try the unquantized model or the GPTQ-quantized version. Neither gave me any such response.
I figured out the problem. I wasn't using the prompt template properly. Now I'm using:
sys_prompt = "You are an AI bot. Don't greet or apologize to user. Give straight-forward response to what the user says."
final_prompt = f"""<s>[INST] <<SYS>>\n{sys_prompt}\n<</SYS>>\n\n{user_msg} [/INST]"""
I'm getting the expected response with this prompt template.
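To avoid typing the special tokens by hand each time, the template above can be wrapped in a small helper. This is just a sketch of the same `<s>[INST] <<SYS>> ... <</SYS>> ... [/INST]` format shown above; the function name is my own.

```python
# Helper that assembles the Llama-2 chat prompt format used above,
# so the special tokens ([INST], <<SYS>>) are never hand-typed per call.
def build_llama2_prompt(user_msg: str, sys_prompt: str) -> str:
    return f"<s>[INST] <<SYS>>\n{sys_prompt}\n<</SYS>>\n\n{user_msg} [/INST]"

sys_prompt = ("You are an AI bot. Don't greet or apologize to user. "
              "Give straight-forward response to what the user says.")
final_prompt = build_llama2_prompt("What is the termination date?", sys_prompt)
```

Getting this template exactly right seems to matter far more for the quantized models than for the unquantized one.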