GGML with starcoder.cpp needs less than 3GB of GPU memory
#4 · by DevElCuy
First of all thanks for this great model!
I managed to convert it to GGML using https://github.com/bigcode-project/starcoder.cpp, and this is how I load it:
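The sub-3GB figure in the title lines up with a back-of-envelope estimate: weight size is roughly parameter count times bytes per weight. A rough sketch (the ~1B parameter count and the bit widths are illustrative assumptions, not numbers taken from the conversion script):

```python
# Rough weight-size estimate for a ~1B-parameter model in GGML form.
# Parameter count and per-weight bit widths are assumptions for illustration.
N_PARAMS = 1.0e9  # ~1B parameters (starcoderbase-1b)

def model_size_gb(n_params, bits_per_weight):
    """Approximate weight size in GB for a given per-weight bit width."""
    return n_params * bits_per_weight / 8 / 1024**3

for bits in (32, 16, 5, 4):  # f32, f16, and two common quantized widths
    print(f"{bits:>2}-bit: {model_size_gb(N_PARAMS, bits):.2f} GB")
```

Even at f16 the weights come to under 2GB, so a sub-3GB footprint including activations and overhead is plausible.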
from ctransformers import AutoModelForCausalLM, AutoTokenizer
from transformers import pipeline
from langchain.llms import HuggingFacePipeline

model_name = "models/abacaj--starcoderbase-1b-sft-ggml.bin"

# Load the GGML weights; gpu_layers=1024 offloads every layer to the GPU,
# and hf=True exposes a transformers-compatible model interface.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    model_type="gpt_bigcode",
    gpu_layers=1024,
    hf=True,
)

# With hf=True, ctransformers builds the tokenizer from the model object itself.
tokenizer = AutoTokenizer.from_pretrained(model)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=2048,
)

llm = HuggingFacePipeline(pipeline=pipe)
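Before wiring up the pipeline, it can be worth sanity-checking that the conversion actually produced a GGML file: plain GGML files begin with the magic number 0x67676d6c. A minimal check sketch (the in-memory bytes stand in for the real .bin file, which isn't assumed to be present):

```python
import io
import struct

GGML_MAGIC = 0x67676D6C  # magic number at the start of plain GGML files

def looks_like_ggml(stream):
    """Return True if the stream starts with the GGML magic number."""
    (magic,) = struct.unpack("<I", stream.read(4))
    return magic == GGML_MAGIC

# Demo with an in-memory stand-in for the converted .bin file:
fake_bin = io.BytesIO(struct.pack("<I", GGML_MAGIC) + b"\x00" * 16)
print(looks_like_ggml(fake_bin))  # → True
```

In practice you would open `model_name` in binary mode and pass the file object instead of the BytesIO stand-in.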
Dependencies (requirements.txt):
torch==2.0.1
transformers==4.33.1
langchain==0.0.285
ctransformers==0.2.26