Run in VS Code
Hi, I just successfully ran a Mistral 7B Instruct GGUF file using ctransformers, but the AI is not responding according to the user input.
How do I solve this?
Can you share how you ran it in VS Code?
I downloaded the GGUF files and used ctransformers to run the code, which is really computationally efficient, but it takes a long time and prints the whole output at the end instead of printing in real time.
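For reference, ctransformers does support token-by-token output via its documented stream=True flag; a minimal sketch (the repo and model file here are assumptions, use whatever GGUF you downloaded):

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",   # assumed repo; point to your local GGUF
    model_file="mistral-7b-instruct-v0.1.Q2_K.gguf",
    model_type="llama",
)

# stream=True yields text chunks one at a time instead of one final string
for chunk in llm("[INST] Hello, who are you? [/INST]", stream=True):
    print(chunk, end="", flush=True)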
I also tried ctransformers, but it gave a segmentation fault; afterwards I tried llama-cpp-python and it worked. Can you share how you did it?
from ctransformers import AutoModelForCausalLM
import gradio as gr

# Load the quantized Mistral model on CPU (no GPU offload)
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    model_file="mistral-7b-instruct-v0.2.Q4_K_M.gguf",
    model_type="llama",
    gpu_layers=0,
)

title = "Shivansh Model"

# Gradio passes (message, history); the history is ignored here
def llm_func(message, history):
    response = llm(message)
    return response

gr.ChatInterface(
    fn=llm_func,
    title=title,
).launch()
This gave a segmentation fault.
On the other hand:
from langchain.llms import LlamaCpp
import gradio as gr

def load_llm():
    # Note: LangChain's LlamaCpp parameter is max_tokens, not max_new_tokens
    llm = LlamaCpp(
        model_path="../model/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
        max_tokens=512,
        temperature=0.1,
    )
    return llm

title = "Shivansh Model"

# Note: this reloads the model from disk on every message, which is slow
def llm_func(message, history):
    llm = load_llm()
    response = llm(message)
    return response

gr.ChatInterface(
    fn=llm_func,
    title=title,
).launch()
This works well.
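By the way, if you want the reply to appear token by token in the chat, gr.ChatInterface accepts a generator function, and recent LangChain versions expose a .stream() method on LLMs; a hedged sketch (untested, assumes recent LangChain and Gradio):

from langchain.llms import LlamaCpp
import gradio as gr

# Load once at module level so the model is not reloaded on every message
llm = LlamaCpp(
    model_path="../model/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
    max_tokens=512,
    temperature=0.1,
)

def llm_func(message, history):
    partial = ""
    # Yield the growing text; gr.ChatInterface re-renders it incrementally
    for chunk in llm.stream(message):
        partial += chunk
        yield partial

gr.ChatInterface(fn=llm_func, title="Shivansh Model").launch()

Loading the model once at import time is also what avoids the per-message reload in the version above.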
Could you share your code?
from langchain.llms import CTransformers
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
import json

# Load model configuration from the specified path
config_path = "D:/Project File/restart/NEW AI/config.json"
with open(config_path, "r") as config_file:
    model_config = json.load(config_file)

# Extract specific parameters
load_params = model_config.get("load_params", {})

# Use the extracted parameters in your main code
model_path = "D:/Project File/restart/NEW AI/mistral-7b-instruct-v0.1.Q2_K.gguf"

# Initialize LangChain's CTransformers with StreamingStdOutCallbackHandler
# so tokens are printed as they are generated
llm = CTransformers(
    model=model_path,
    callbacks=[StreamingStdOutCallbackHandler()],
)

# History is collected below but never fed back into the prompt,
# so the model cannot see earlier turns
conversation_history = []

# Prompt template in the Mistral Instruct [INST] ... [/INST] format
prompt_template = {
    "pre_prompt": "You are an artificial intelligence called VICTOR. VICTOR stands for Virtual Intelligent Companion for Technological Optimization and Reinforcement, created by Arun Raj, a physics student. You are a friend of the user and aim to keep the conversation concise and engaging.",
    "pre_prompt_suffix": "",
    "pre_prompt_prefix": "",
    "input_prefix": "[INST]",
    "input_suffix": "[/INST]",
    "antiprompt": ["[INST]"],
}

while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break
    formatted_input = f"{prompt_template['input_prefix']}{user_input}{prompt_template['input_suffix']}"
    # The streaming callback prints the response while this call runs
    response = llm(prompt_template["pre_prompt"] + formatted_input)
    print()
    conversation_history.append(("User", user_input))
    conversation_history.append(("AI", response))

print("Chatbot session ended.")
With this code I am still facing a memory issue.
Can you help me with memory?
Can you share your hardware details so I can help you?
Ryzen 5 5600H, an NVIDIA GTX 1650 and an AMD Radeon GPU, 24 GB RAM.
I mean conversational memory; the model doesn't remember previous interactions.
You can try out langchain.memory for conversational/contextual memory.
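A minimal sketch of what that could look like with ConversationBufferMemory and ConversationChain (model path assumed from your script above):

from langchain.llms import CTransformers
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = CTransformers(
    model="D:/Project File/restart/NEW AI/mistral-7b-instruct-v0.1.Q2_K.gguf",
)

# ConversationBufferMemory keeps the running transcript, and
# ConversationChain prepends it to every prompt automatically
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())

while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break
    print("AI:", conversation.predict(input=user_input))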
Yeah, I tried that. The context window is 8k now, but there is no long-term memory, so I am planning to integrate a database.
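A bare-bones version of that long-term store can be sketched with just the standard library; the schema here is hypothetical:

import sqlite3

# Hypothetical schema: one row per conversation turn
conn = sqlite3.connect("victor_memory.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS turns ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, speaker TEXT, text TEXT)"
)

def save_turn(speaker, text):
    conn.execute("INSERT INTO turns (speaker, text) VALUES (?, ?)", (speaker, text))
    conn.commit()

def recent_turns(n=10):
    # Pull the last n turns to prepend to the prompt as recovered context
    rows = conn.execute(
        "SELECT speaker, text FROM turns ORDER BY id DESC LIMIT ?", (n,)
    ).fetchall()
    return list(reversed(rows))

save_turn("User", "Hello VICTOR")
print(recent_turns())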
Do you know any open-source text-to-speech library for speaking the streamed output live?
No, I have not explored that area yet.
It's OK... Can you tell me how you used llama.cpp?
I get some errors while installing llama.cpp... If you know the fix, please text me on Insta: Arun_luka