Mistral-Small-24B-Instruct-2501 Variant Round Up + Docker and Inference Quick Start
Hello World
I'm experimenting with a new format that compares notable model variants and provides quick starting points for download, Docker, and inference.
Please let me know if you come across other inference providers or cool variants you think should be included. Thank you!
| Variant | RAM / VRAM needed | Notes (usage) | Download (format) | Run instantly |
|---|---|---|---|---|
| Mistral-Small-24B-Instruct-2501 | ~50–52 GB (est., fp16) | Official release; instruction-tuned for general chat & API | HF safetensors | Bytez Instruct |
| yentinglin/Mistral-Small-24B-Instruct-2501-reasoning | ~50–52 GB (est., fp16) | Fine-tuned for math & reasoning tasks | HF safetensors | N/A |
| unsloth/Mistral-Small-24B-Instruct-2501-GGUF | ~14–28 GB (quant-dependent) | Quantized GGUF; consumer GPU/laptop friendly (see sketch below) | HF GGUF | N/A |
| casperhansen/mistral-small-24b-instruct-2501-awq | ~12–14 GB (est., 4-bit AWQ) | 4-bit quantization; fits a single ~16 GB GPU | HF AWQ | N/A |
| huihui_ai/Mistral-Small-24B-Abliterated | ~14 GB (Q4_K_M) | "Uncensored" fork; research / red-team use | Ollama fork | N/A |
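If you want to try the GGUF variant locally, here's a minimal sketch using `huggingface-cli` and llama.cpp's `llama-server`. The quant filename below is an assumption on my part, so check the repo's file list for the exact name.

```bash
# Fetch one quant file from the repo (filename is illustrative; verify on the HF page)
huggingface-cli download unsloth/Mistral-Small-24B-Instruct-2501-GGUF \
  Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf --local-dir ./models

# Serve it with llama.cpp's OpenAI-compatible server
llama-server -m ./models/Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf --port 8081
```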
## Docker Quickstart
```yaml
version: "3.8"
services:
  mistral_small_24b:
    image: ghcr.io/bytez-com/models/mistralai/mistral-small-24b-instruct-2501:latest
    ports:
      - "8080:80"
```
Start the container, then send a request:
```bash
docker compose up -d

curl -X POST http://localhost:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Explain the significance of 24B parameter models.","max_tokens":64}'
```
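A 24B model generally needs GPU access inside the container. Here's a sketch of an equivalent one-off run with GPU passthrough, assuming the NVIDIA Container Toolkit is installed (the Compose file above can express the same thing via `deploy.resources.reservations.devices`):

```bash
# One-off run with GPU passthrough (assumes NVIDIA Container Toolkit is installed)
docker run --rm -d --gpus all -p 8080:80 \
  ghcr.io/bytez-com/models/mistralai/mistral-small-24b-instruct-2501:latest
```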
## Bytez SDK Quickstart (Node.js)
```bash
npm i bytez.js
# or
yarn add bytez.js
```
```js
import Bytez from "bytez.js"

// Authenticate and select the hosted model
const sdk = new Bytez("YOUR_API_KEY")
const model = sdk.model("mistralai/Mistral-Small-24B-Instruct-2501")

// Run a single prompt and inspect the result
const { error, output } = await model.run(
  "Explain the significance of 24B parameter models."
)
console.log({ error, output })
```
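One small note, not specific to the Bytez API: keep the key out of source control and read it from the environment instead of hardcoding it (the variable name below is illustrative; in Node you'd read it via `process.env.BYTEZ_API_KEY`).

```bash
# Illustrative variable name; pass it into your Node process instead of hardcoding
export BYTEZ_API_KEY="..."
```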