Gemma 2 - Inference Endpoint

NOTICE: This model does, in fact, run on Inference Endpoints. Unlike regular GGUF models, you can just click deploy. The model is no longer stored, merely linked. Enjoy <3


{
  "inputs": "A plain old prompt with nothing else"
}
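
For anyone who wants to call a deployed endpoint from code, here is a minimal sketch using only the Python standard library. It sends the same plain-prompt payload shown above. The endpoint URL and token are placeholders (assumptions, not real values), the Bearer header assumes the endpoint was deployed as protected, and the shape of the JSON that comes back depends on the container, so the sketch simply returns whatever is decoded.

```python
import json
import urllib.request


def build_request(endpoint_url, token, prompt):
    """Build a POST request carrying the {"inputs": ...} payload."""
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={
            # Assumption: the endpoint is protected and expects a Bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(endpoint_url, token, prompt, timeout=60):
    """Send the prompt and return the decoded JSON response as-is."""
    request = build_request(endpoint_url, token, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical values: replace with your own endpoint URL and token.
    ENDPOINT_URL = "https://your-endpoint.example/"
    TOKEN = "hf_your_token_here"
    print(query(ENDPOINT_URL, TOKEN, "A plain old prompt with nothing else"))
```

Keeping the request building separate from the network call makes it easy to check the payload before spending any money on a running endpoint.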

Multi-turn coming soon...

Hello! I wrote a simple container that makes it easy to run llama-cpp-python with GGUF models. My goal here was a cheap way to play with Gemma, but then I thought maybe I'd share it in case it's helpful. I'll probably make a bunch of these, so if you have any requests for GGUF or otherwise quantized llama.cpp models to become inference endpoints, please feel free to reach out!

Files

I used the excellent quant from lmstudio-ai/gemma-2b-it-GGUF.

My email is newp@justkidding.net

Just kidding, it's sam att samuellmeyers DOT... com
