Instructions to use RepublicOfKorokke/GLM-4.7-Flash-oQ3 with libraries, inference providers, notebooks, and local apps.
- Libraries
- MLX
How to use RepublicOfKorokke/GLM-4.7-Flash-oQ3 with MLX:
    # Make sure mlx-lm is installed
    # pip install --upgrade mlx-lm

    # Generate text with mlx-lm
    from mlx_lm import load, generate

    model, tokenizer = load("RepublicOfKorokke/GLM-4.7-Flash-oQ3")

    prompt = "Write a story about Einstein"
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

    text = generate(model, tokenizer, prompt=prompt, verbose=True)

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use RepublicOfKorokke/GLM-4.7-Flash-oQ3 with Pi:
Start the MLX server
    # Install MLX LM:
    uv tool install mlx-lm

    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "RepublicOfKorokke/GLM-4.7-Flash-oQ3"
Configure the model in Pi
    # Install Pi:
    npm install -g @mariozechner/pi-coding-agent

    # Add to ~/.pi/agent/models.json:
    {
      "providers": {
        "mlx-lm": {
          "baseUrl": "http://localhost:8080/v1",
          "api": "openai-completions",
          "apiKey": "none",
          "models": [
            { "id": "RepublicOfKorokke/GLM-4.7-Flash-oQ3" }
          ]
        }
      }
    }

Run Pi

    # Start Pi in your project directory:
    pi
- Hermes Agent
How to use RepublicOfKorokke/GLM-4.7-Flash-oQ3 with Hermes Agent:
Start the MLX server
    # Install MLX LM:
    uv tool install mlx-lm

    # Start a local OpenAI-compatible server:
    mlx_lm.server --model "RepublicOfKorokke/GLM-4.7-Flash-oQ3"
Configure Hermes
    # Install Hermes:
    curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
    hermes setup

    # Point Hermes at the local server:
    hermes config set model.provider custom
    hermes config set model.base_url http://127.0.0.1:8080/v1
    hermes config set model.default RepublicOfKorokke/GLM-4.7-Flash-oQ3
Run Hermes
hermes
- MLX LM
How to use RepublicOfKorokke/GLM-4.7-Flash-oQ3 with MLX LM:
Generate or start a chat session
    # Install MLX LM
    uv tool install mlx-lm

    # Interactive chat REPL
    mlx_lm.chat --model "RepublicOfKorokke/GLM-4.7-Flash-oQ3"
Run an OpenAI-compatible server
    # Install MLX LM
    uv tool install mlx-lm

    # Start the server (listens on port 8080 unless --port is given)
    mlx_lm.server --model "RepublicOfKorokke/GLM-4.7-Flash-oQ3"

    # Call the OpenAI-compatible server with curl
    curl -X POST "http://localhost:8080/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "RepublicOfKorokke/GLM-4.7-Flash-oQ3",
        "messages": [
          {"role": "user", "content": "Hello"}
        ]
      }'
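The same chat completions request can be sent from Python using only the standard library. This is a minimal sketch, not part of the original card: it assumes the server is reachable at `http://localhost:8080/v1` (the address used in the Pi and Hermes configurations above) and that the response follows the usual OpenAI chat completions shape. The helper names `build_chat_request` and `chat` are illustrative.

```python
import json
import urllib.request

# Assumption: the server runs locally on port 8080; adjust if you passed --host/--port.
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "RepublicOfKorokke/GLM-4.7-Flash-oQ3"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_text)) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response with the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. Splitting request construction from sending makes it possible to inspect the payload without a live server.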
GLM-4.7-Flash-oQ3
This model was quantized using oQ mixed-precision quantization.
Quantization details
- Model type: glm4_moe_lite
- Bits: 3
- Group size: 64
- Format: MLX safetensors
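To give a feel for what "3 bits, group size 64" means for storage, the sketch below estimates the effective bits per weight for uniform group-wise quantization. It assumes each group stores one 16-bit scale and one 16-bit bias alongside the packed weights. Because oQ is mixed-precision, individual layers may use other bit widths, so treat this as a rough illustration rather than a description of this checkpoint's exact layout. The function names and the 1-billion-parameter figure are hypothetical.

```python
def effective_bits_per_weight(bits, group_size, scale_bits=16, bias_bits=16):
    """Bits stored per weight once per-group scale and bias are amortized.

    Assumption: one scale and one bias per group, each stored in 16 bits.
    """
    return bits + (scale_bits + bias_bits) / group_size


def estimated_size_gb(num_params, bits, group_size):
    """Rough size in GB (10^9 bytes) if every weight used the same settings."""
    total_bits = num_params * effective_bits_per_weight(bits, group_size)
    return total_bits / 8 / 1e9


# Settings listed above: 3 bits, group size 64.
print(effective_bits_per_weight(3, 64))

# Hypothetical 1-billion-parameter tensor, for scale only.
print(estimated_size_gb(1_000_000_000, 3, 64))
```

Under these assumptions the per-group metadata adds half a bit per weight, which is one reason a nominally 3-bit file is larger than three-eighths of an 8-bit one.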
Benchmark
| Model | File size | MMLU | JMMLU | HELLASWAG | ARC_CHALLENGE | GSM8K |
|---|---|---|---|---|---|---|
| GLM-4.7-Flash-MLX-6bit | 22.68 GB | 71.3% | 63.3% | 69.0% | 86.0% | 89.3% |
| GLM-4.7-Flash-oQ3 | 12.93 GB | 63.7% | 56.3% | 62.0% | 80.3% | 88.0% |
| GLM-4.7-Flash-oQ3.5 | 14.00 GB | 63.7% | 56.7% | 59.3% | 78.7% | 84.0% |
| GLM-4.7-Flash-oQ4 | 16.4 GB | 71.0% | 60.0% | 62.0% | 84.3% | 92.0% |
| GLM-4.7-Flash-REAP-23B-A3B-6bit | 17.43 GB | 62.3% | 46.0% | - | - | - |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | 9.91 GB | 53.3% | 38.3% | 47.7% | 73.3% | 73.3% |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3.5 | 10.62 GB | 57.7% | 49.3% | - | - | - |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | 12.51 GB | 59.3% | 43.0% | 53.3% | 78.7% | 87.7% |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | 15.21 GB | 61.0% | 45.3% | 59.0% | 81.0% | 90.0% |
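One way to read the table is to compare each quantized variant with the 6-bit model on file size and score. The sketch below does this for MMLU using values copied from the table; the helper name `compare_to_baseline` is illustrative, and the same pattern applies to the other benchmark columns.

```python
# (file size in GB, MMLU accuracy in %) copied from the benchmark table above.
RESULTS = {
    "GLM-4.7-Flash-MLX-6bit": (22.68, 71.3),
    "GLM-4.7-Flash-oQ3": (12.93, 63.7),
    "GLM-4.7-Flash-oQ3.5": (14.00, 63.7),
    "GLM-4.7-Flash-oQ4": (16.4, 71.0),
}


def compare_to_baseline(results, baseline):
    """Return size ratio and MMLU difference of each model versus the baseline."""
    base_size, base_score = results[baseline]
    return {
        name: {
            "size_ratio": size / base_size,
            "mmlu_delta": score - base_score,
        }
        for name, (size, score) in results.items()
        if name != baseline
    }


comparison = compare_to_baseline(RESULTS, "GLM-4.7-Flash-MLX-6bit")
for name, row in comparison.items():
    print(
        f"{name}: {row['size_ratio']:.0%} of baseline size, "
        f"{row['mmlu_delta']:+.1f} MMLU points"
    )
```

Each benchmark here uses 300 questions, so differences of a few points may fall within sampling noise.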
Detail
| Model | Benchmark | Accuracy | Correct | Total | Time(s) |
|---|---|---|---|---|---|
| GLM-4.7-Flash-MLX-6bit | MMLU | 71.3% | 214 | 300 | 533.4 |
| GLM-4.7-Flash-MLX-6bit | JMMLU | 63.3% | 190 | 300 | 260.3 |
| GLM-4.7-Flash-MLX-6bit | HELLASWAG | 69.0% | 207 | 300 | 305.7 |
| GLM-4.7-Flash-MLX-6bit | ARC_CHALLENGE | 86.0% | 258 | 300 | 200.5 |
| GLM-4.7-Flash-MLX-6bit | GSM8K | 89.3% | 268 | 300 | 813.9 |
| GLM-4.7-Flash-oQ3 | MMLU | 63.7% | 191 | 300 | 554.4 |
| GLM-4.7-Flash-oQ3 | JMMLU | 56.3% | 169 | 300 | 433.9 |
| GLM-4.7-Flash-oQ3 | HELLASWAG | 62.0% | 186 | 300 | 355.8 |
| GLM-4.7-Flash-oQ3 | ARC_CHALLENGE | 80.3% | 241 | 300 | 196.4 |
| GLM-4.7-Flash-oQ3 | GSM8K | 88.0% | 264 | 300 | 857.8 |
| GLM-4.7-Flash-oQ3.5 | MMLU | 63.7% | 191 | 300 | 564.6 |
| GLM-4.7-Flash-oQ3.5 | JMMLU | 56.7% | 170 | 300 | 439.6 |
| GLM-4.7-Flash-oQ3.5 | HELLASWAG | 59.3% | 178 | 300 | 335.4 |
| GLM-4.7-Flash-oQ3.5 | ARC_CHALLENGE | 78.7% | 236 | 300 | 192.8 |
| GLM-4.7-Flash-oQ3.5 | GSM8K | 84.0% | 252 | 300 | 859.4 |
| GLM-4.7-Flash-oQ4 | MMLU | 71.0% | 213 | 300 | 569 |
| GLM-4.7-Flash-oQ4 | JMMLU | 60.0% | 180 | 300 | 297.9 |
| GLM-4.7-Flash-oQ4 | HELLASWAG | 62.0% | 186 | 300 | 346.3 |
| GLM-4.7-Flash-oQ4 | ARC_CHALLENGE | 84.3% | 253 | 300 | 190.9 |
| GLM-4.7-Flash-oQ4 | GSM8K | 92.0% | 276 | 300 | 820.9 |
| GLM-4.7-Flash-REAP-23B-A3B-6bit | MMLU | 62.3% | 187 | 300 | 505.9 |
| GLM-4.7-Flash-REAP-23B-A3B-6bit | JMMLU | 46.0% | 138 | 300 | 239.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | MMLU | 53.3% | 160 | 300 | 602.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | JMMLU | 38.3% | 115 | 300 | 255.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | HELLASWAG | 47.7% | 143 | 300 | 346.8 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | ARC_CHALLENGE | 73.3% | 220 | 300 | 204.8 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | GSM8K | 73.3% | 220 | 300 | 1029.3 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3.5 | MMLU | 57.7% | 173 | 300 | 555.1 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3.5 | JMMLU | 49.3% | 148 | 300 | 252.4 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | MMLU | 63.3% | 190 | 300 | 550.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | JMMLU | 39.7% | 119 | 300 | 250.9 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | MMLU | 59.3% | 178 | 300 | 547.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | JMMLU | 43.0% | 129 | 300 | 232.6 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | HELLASWAG | 53.3% | 160 | 300 | 300.5 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | ARC_CHALLENGE | 78.7% | 236 | 300 | 179.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | GSM8K | 87.7% | 263 | 300 | 748.4 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | MMLU | 61.0% | 183 | 300 | 617.8 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | JMMLU | 45.3% | 136 | 300 | 273 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | HELLASWAG | 59.0% | 177 | 300 | 353.6 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | ARC_CHALLENGE | 81.0% | 243 | 300 | 201.2 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | GSM8K | 90.0% | 270 | 300 | 1001.1 |
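The Accuracy column is Correct divided by Total. With 300 questions per benchmark, each score also carries sampling uncertainty. The sketch below recomputes one accuracy from the table and estimates its standard error with a simple binomial model. It assumes the questions are independent draws, which is an approximation, and the function names are illustrative.

```python
import math


def accuracy_pct(correct, total):
    """Accuracy as a percentage."""
    return 100.0 * correct / total


def stderr_pct(correct, total):
    """Binomial standard error of the accuracy, in percentage points.

    Assumption: questions are independent and equally weighted.
    """
    p = correct / total
    return 100.0 * math.sqrt(p * (1.0 - p) / total)


# GLM-4.7-Flash-oQ3 on MMLU: 191 correct out of 300 (from the table above).
acc = accuracy_pct(191, 300)
se = stderr_pct(191, 300)
print(f"accuracy = {acc:.1f}%, standard error = {se:.1f} points")
print(f"approximate 95% interval: {acc - 1.96 * se:.1f}% to {acc + 1.96 * se:.1f}%")
```

Under this model the standard error is a little under 3 points, so gaps of a couple of points between variants on a single benchmark are not strong evidence of a real difference.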
Model size
4B params
Tensor type
BF16 · U32
Model tree for RepublicOfKorokke/GLM-4.7-Flash-oQ3
Base model
zai-org/GLM-4.7-Flash