Model Card for alokabhishek/Mistral-7B-Instruct-v0.2-GGUF
This repo contains a GGUF quantized version of Mistral AI_'s Mistral-7B-Instruct-v0.2 model, quantized using llama.cpp.
Model Details
- Model creator: Mistral AI_
- Original model: Mistral-7B-Instruct-v0.2
About GGUF quantization using llama.cpp
- llama.cpp GitHub repo: https://github.com/ggerganov/llama.cpp
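Each quantized variant in this repo is a single .gguf file. If you only need one variant rather than the whole repo, a minimal sketch using huggingface_hub's hf_hub_download (the filename shown is the Q4_K_M variant referenced later in this card; swap the suffix for another quantization level as needed):

# Sketch: download a single GGUF file from this repo via huggingface_hub
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="alokabhishek/Mistral-7B-Instruct-v0.2-GGUF",
    filename="mistral-7b-instruct-v0.2.Q4_K_M.gguf",
)
print(gguf_path)  # local cache path of the downloaded model file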
How to Get Started with the Model
Use the code below to get started with the model.
How to run from Python code
First install the packages
# Base ctransformers with CUDA GPU acceleration
! pip install "ctransformers[cuda]>=0.2.24"
# Or with no GPU acceleration
# ! pip install "ctransformers>=0.2.24"
! pip install transformers huggingface_hub torch
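The gpu_layers setting below only takes effect with the CUDA build of ctransformers. A quick sanity check that a GPU is visible, using the torch package installed above:

import torch

# If this prints False, install the base (CPU-only) ctransformers
# and set gpu_layers=0 when loading the model below.
print(torch.cuda.is_available())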
Import
from ctransformers import AutoModelForCausalLM
from transformers import pipeline, AutoTokenizer
Use a pipeline as a high-level helper
# Load the LLM and tokenizer
model_mistral = AutoModelForCausalLM.from_pretrained(
    "alokabhishek/Mistral-7B-Instruct-v0.2-GGUF",
    model_file="mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # replace Q4_K_M.gguf with Q5_K_M.gguf as needed
    model_type="mistral",
    gpu_layers=50,  # number of layers to offload to the GPU; use 0 for CPU-only
    hf=True,  # return a transformers-compatible model so it works with pipeline()
)

tokenizer_mistral = AutoTokenizer.from_pretrained(
    "alokabhishek/Mistral-7B-Instruct-v0.2-GGUF", use_fast=True
)

# Create a text-generation pipeline
pipe_mistral = pipeline(task="text-generation", model=model_mistral, tokenizer=tokenizer_mistral)

prompt_mistral = "Tell me a funny joke about Large Language Models meeting a Blackhole in an intergalactic Bar."
output_mistral = pipe_mistral(prompt_mistral, max_new_tokens=512)
print(output_mistral[0]["generated_text"])
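Mistral-7B-Instruct models expect prompts wrapped in the [INST] ... [/INST] instruction format. A minimal sketch that applies this via the tokenizer's chat template, reusing pipe_mistral from above (assuming the tokenizer in this repo carries Mistral's chat template; otherwise load the tokenizer from the original mistralai/Mistral-7B-Instruct-v0.2 repo):

# Sketch: wrap the prompt in Mistral's [INST] ... [/INST] format
# using the tokenizer's chat template before generating.
messages = [{"role": "user", "content": prompt_mistral}]
chat_prompt = tokenizer_mistral.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
output_chat = pipe_mistral(chat_prompt, max_new_tokens=512)
print(output_chat[0]["generated_text"])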
Uses
Direct Use
[More Information Needed]
Downstream Use
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Evaluation
Results
[More Information Needed]
Model Card Authors
[More Information Needed]
Model Card Contact
[More Information Needed]