GHOSTAI — HORROR GGUF (7B)
A focused, horror-themed 7B model released exclusively in quantized GGUF format for the llama.cpp ecosystem.
Quantized-only release. No FP16 weights included.

Overview

GHOSTAI is a compact, atmosphere-driven horror model designed for narrative generation, roleplay, and dark storytelling.
It prioritizes tone, pacing, and vivid imagery over generic assistant behavior.

This repository provides multiple GGUF quantizations, allowing you to choose the best balance of quality, speed, and memory usage for your hardware.

The model runs:

Fully on CPU
With optional GPU offload (CUDA / Metal / Vulkan builds of llama.cpp)

Quantization choice is independent of whether you use CPU or GPU.

Files

File	Quant	Approx size	Rough RAM needed (4k ctx)
`ghostai-horror-7b.Q8_0.gguf`	Q8_0	~7.2 GB	~10–11 GB
`ghostai-horror-7b.Q6_K.gguf`	Q6_K	~5.5 GB	~8–9 GB
`ghostai-horror-7b.Q5_K_M.gguf`	Q5_K_M	~4.8 GB	~7–8 GB
`ghostai-horror-7b.Q5_K_S.gguf`	Q5_K_S	~4.7 GB	~7–8 GB
`ghostai-horror-7b.Q4_K_M.gguf`	Q4_K_M	~4.1 GB	~6–7 GB
`ghostai-horror-7b.Q4_K_S.gguf`	Q4_K_S	~3.9 GB	~6–7 GB
`ghostai-horror-7b.Q3_K_M.gguf`	Q3_K_M	~3.3 GB	~5–6 GB
`ghostai-horror-7b.Q3_K_S.gguf`	Q3_K_S	~3.0 GB	~5–6 GB
`ghostai-horror-7b.Q2_K.gguf`	Q2_K	~2.5 GB	~4–5 GB
`ghostai-horror-7b.TQ1_0.gguf`	TQ1_0	~1.6 GB	~3–4 GB

Notes:

“Rough RAM needed” assumes ~4k context and typical llama.cpp overhead.
For 8k context, plan +1–2 GB extra.
GPU offload moves part of the model into VRAM, but you still need system RAM for any layers left on the CPU and for runtime overhead.
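
As a rough illustration of the notes above, the table and the 8k rule of thumb can be folded into a small shell helper. This is only a sketch: the function name `rough_ram_gb` is made up for this example, it uses the upper end of each range from the table, and it assumes the upper "+2 GB" figure for 8k context. The notes give no figure beyond 8k, so the helper refuses larger contexts rather than guess.

```shell
# Sketch: upper-end RAM estimate in GB, taken from the Files table.
# rough_ram_gb is a hypothetical helper, not part of llama.cpp.
rough_ram_gb() {
  local quant="$1"
  local ctx="${2:-4096}"
  local base

  # Upper end of the "Rough RAM needed (4k ctx)" column.
  case "$quant" in
    Q8_0)          base=11 ;;
    Q6_K)          base=9 ;;
    Q5_K_M|Q5_K_S) base=8 ;;
    Q4_K_M|Q4_K_S) base=7 ;;
    Q3_K_M|Q3_K_S) base=6 ;;
    Q2_K)          base=5 ;;
    TQ1_0)         base=4 ;;
    *) echo "unknown quant: $quant" >&2; return 1 ;;
  esac

  if [ "$ctx" -le 4096 ]; then
    echo "$base"
  elif [ "$ctx" -le 8192 ]; then
    # Assumption: take the upper figure of the "+1-2 GB" note for 8k context.
    echo $((base + 2))
  else
    echo "no estimate given for contexts above 8k" >&2
    return 1
  fi
}
```

For example, `rough_ram_gb Q4_K_M 8192` prints `9`, the 7 GB upper estimate at 4k plus 2 GB for the longer context.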

Recommended Downloads

Best default: Q4_K_M
More quality (more RAM): Q5_K_M, Q6_K, Q8_0
Low RAM: Q3_K_S, Q2_K
Ultra-small / experimental: TQ1_0 (expect noticeable quality loss)
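
One way to read these recommendations is "take the highest-quality file that fits in memory". The sketch below does that for a given amount of free RAM in GB. The function name `pick_quant` is hypothetical, the thresholds are the upper ends of the 4k-context estimates from the Files table, and only the quants named in this section are considered.

```shell
# Sketch: choose the largest recommended quant that fits in the given RAM (GB).
# pick_quant is a hypothetical helper; thresholds come from the Files table (4k ctx).
pick_quant() {
  local avail_gb="$1"
  local entry quant need

  # "quant:RAM estimate", ordered from highest quality to smallest.
  for entry in Q8_0:11 Q6_K:9 Q5_K_M:8 Q4_K_M:7 Q3_K_S:6 Q2_K:5 TQ1_0:4; do
    quant="${entry%%:*}"   # text before the colon
    need="${entry##*:}"    # text after the colon
    if [ "$avail_gb" -ge "$need" ]; then
      echo "ghostai-horror-7b.${quant}.gguf"
      return 0
    fi
  done

  echo "under 4 GB available: none of the listed files is expected to fit" >&2
  return 1
}
```

`pick_quant 7` prints `ghostai-horror-7b.Q4_K_M.gguf`, which matches the default recommendation. Pass a figure that leaves headroom for the operating system and other programs. On Linux, `awk '/MemAvailable/ {print int($2/1024/1024)}' /proc/meminfo` gives a usable whole-GB value.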

Quickstart (llama.cpp)

1) Run on CPU

./llama-cli \
  -m ghostai-horror-7b.Q4_K_M.gguf \
  -c 4096 \
  -t 8 \
  -p "You are GHOSTAI. Speak like a calm horror narrator. Keep it tight and vivid."
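
The overview mentions optional GPU offload. A minimal sketch of the same run with some layers offloaded is shown below, assuming a llama.cpp build with a GPU backend (CUDA, Metal, or Vulkan). The wrapper name `run_ghostai_gpu` and the `LLAMA_CLI` override variable are made up for this example, and the default of 20 layers is an arbitrary starting point, not a tested recommendation. `-ngl` is llama.cpp's flag for the number of layers to place on the GPU.

```shell
# Sketch: CPU example plus GPU layer offload via -ngl.
# run_ghostai_gpu and LLAMA_CLI are hypothetical names used for illustration.
run_ghostai_gpu() {
  local layers="${1:-20}"   # layers to offload; tune to your VRAM

  "${LLAMA_CLI:-./llama-cli}" \
    -m ghostai-horror-7b.Q4_K_M.gguf \
    -c 4096 \
    -t 8 \
    -ngl "$layers" \
    -p "You are GHOSTAI. Speak like a calm horror narrator. Keep it tight and vivid."
}
```

Call it as `run_ghostai_gpu 32` to offload 32 layers. If the model fails to load or runs out of VRAM, lower the number. Raising it moves more of the model onto the GPU.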

Model details: GGUF format, 7B params, llama architecture.
