Model Card for Mistral-Large-Instruct-2411-MLX

This is a 4bit quantization of the Mistral Large Instruct 2411 model for MLX (Apple silicon). It was created using the mlx-lm library with the following CLI command: mlx_lm.convert
--hf-path /path/to/your/fp16/model
-q
--q-bits 4
--q-group-size 32

Quantized Versions

Each version is optimized for specific memory and performance trade-offs.

Original Model

The original Mistral-Large-Instruct-2411 model is available here. Mistral model usage is governed by the Mistral Research License.

License

This model family is governed by the Mistral Research License. Please review the license terms before use.

Model Details
- Model Description
Uses
- Direct Use
- Out-of-Scope Use
Bias, Risks, and Limitations
- Recommendations
Technical Specifications
How to Get Started

Model Details

Model Description

The Mistral-Large-Instruct-2411-MLX family includes quantized versions of the Mistral Large Instruct 2411 model, optimized for deployment on MLX (Apple Silicon). The quantization reduces memory usage and inference latency, enabling efficient deployment on resource-constrained systems.

Developed by: Mistral AI
Model type: Large language model
Language(s): English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Russian, Korean
Quantization levels: 2-bit (Q2), 4-bit (Q4)

Technical Specifications

Parent Model: Mistral-Large-Instruct-2411
Quantization: 2-bit (Q2), 4-bit (Q4)
Framework: MLX (mlx-lm library)

How to Get Started

Visit the individual quantized repositories for details and usage instructions:

Model Card Contact

For inquiries, contact Zach Landes.

zachlandes
/

Mistral-Large-Instruct-2411-Q4-MLX