Mistral-Small-24B-Instruct-2501 Variant Round Up + Docker and Inference Quick Start


Hello World

I'm experimenting with a new format to compare useful model variants and to provide quick starting points for download, Docker, and inference.
Please let me know if you come across other inference providers, or interesting variants you think should be included. Thank you!

| Variant | RAM / VRAM needed | Notes (usage) | Download (format) | Run instantly |
|---|---|---|---|---|
| Mistral-Small-24B-Instruct-2501 | ~50–52 GB (est., fp16) | Official release; instruction-tuned for general chat & API use | HF safetensors | Bytez Instruct |
| yentinglin/Mistral-Small-24B-Instruct-2501-reasoning | ~50–52 GB (est., fp16) | Fine-tuned for math & reasoning tasks | HF safetensors | ❓ |
| unsloth/Mistral-Small-24B-Instruct-2501-GGUF | ~14–28 GB (depends on quant) | Quantized GGUF; consumer GPU / laptop friendly | HF GGUF | ❓ |
| casperhansen/mistral-small-24b-instruct-2501-awq | ~12–14 GB (est., 4-bit AWQ) | 4-bit quantization; runs on smaller GPUs | HF AWQ | ❓ |
| huihui_ai/Mistral-Small-24B-Abliterated | ~14 GB (Q4_K_M quant) | "Uncensored" fork; research / red-team use | Ollama fork | ❓ |
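
For the GGUF variant, here is a minimal local sketch using the Hugging Face CLI and llama.cpp. The Q4_K_M filename is an assumption; list the files in the unsloth repo to confirm the exact name before downloading.

```bash
# Download one quant from the unsloth GGUF repo. The filename below is
# an assumption — check the repo's file list if it differs.
huggingface-cli download unsloth/Mistral-Small-24B-Instruct-2501-GGUF \
  Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf --local-dir .

# Run a one-off prompt with llama.cpp's CLI (-n caps generated tokens)
llama-cli -m Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf \
  -p "Explain the significance of 24B parameter models." -n 64
```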

πŸš€ Docker Quickstart

```yaml
version: "3.8"
services:
  mistral_small_24b:
    image: ghcr.io/bytez-com/models/mistralai/mistral-small-24b-instruct-2501:latest
    ports:
      - "8080:80"   # expose the container's port 80 on localhost:8080
```

Start the service and send a test request:

```bash
docker compose up -d
curl -X POST http://localhost:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Explain the significance of 24B parameter models.","max_tokens":64}'
```

📦 Bytez SDK Quickstart (Node.js)

```bash
npm i bytez.js
# or
yarn add bytez.js
```

```js
import Bytez from "bytez.js"

// Authenticate with your Bytez API key
const sdk = new Bytez("YOUR_API_KEY")
const model = sdk.model("mistralai/Mistral-Small-24B-Instruct-2501")

// Run a single prompt; the SDK resolves to { error, output }
const { error, output } = await model.run(
  "Explain the significance of 24B parameter models."
)

console.log({ error, output })
```
