casperhansen/mistral-7b-instruct-v0.1-awq · Multiple questions regarding generated output

Hi,

Thanks for this AWQ model!

I have several questions about the output produced by the example code on the model's main page. Running it gives me the following:

Fetching 10 files: 100%|██████████| 10/10 [00:00<00:00, 118818.81it/s]
Replacing layers...: 100%|██████████| 32/32 [00:02<00:00, 15.25it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:2 for open-end generation.
[INST] What is your favourite condiment? [/INST] My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of dishes.

1/ How do I resolve the warnings about the attention mask and the pad token?
2/ Why does the output include the prompt instead of only the answer?
3/ Why does the output contain the answer to the first question and not the second?
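
For context on 1/ and 2/, here is a minimal sketch of what I understand the usual approach to be. It rests on assumptions: that the model follows the standard transformers generate() behaviour for decoder-only models, where the returned sequence is the prompt tokens followed by the new tokens. The helper name strip_prompt and the token ids are made up for illustration, and max_new_tokens=64 is an arbitrary value. It does not address 3/.

```python
# Sketch only: decoder-only models return prompt tokens + newly generated tokens,
# so the answer alone can be recovered by slicing off the prompt length.

def strip_prompt(generated_ids, prompt_length):
    """Return only the newly generated token ids from each sequence."""
    return [seq[prompt_length:] for seq in generated_ids]

# Toy ids standing in for real tokenizer output (values are made up).
prompt_ids = [1, 733, 16289, 28793]
generated = [prompt_ids + [1984, 6656, 2]]

new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)

# With transformers, the equivalent would be roughly (assumed, not taken from the model card):
#   inputs = tokenizer(prompt, return_tensors="pt")   # includes attention_mask
#   out = model.generate(**inputs,                    # passes attention_mask along
#                        pad_token_id=tokenizer.eos_token_id,
#                        max_new_tokens=64)
#   answer = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:],
#                             skip_special_tokens=True)
```

Passing the tokenizer's attention_mask and an explicit pad_token_id should be what silences the two warnings, and decoding only the sliced tokens should drop the "[INST] ... [/INST]" part from the printed text.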