Update README.md #2
by zx-modelcloud · opened

README.md CHANGED
@@ -1,26 +1,42 @@
-This model was exported using [GPTQModel](https://github.com/ModelCloud/GPTQModel).
+This model was quantized and exported to MLX format using [GPTQModel](https://github.com/ModelCloud/GPTQModel).
 
-##
-```python
-from gptqmodel import GPTQModel
+## How to run this model
 
-# load gptq quantized model
-gptq_model_path = "ModelCloud/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-v3"
-mlx_path = f"./vortex/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-v3-mlx"
 
-# export to mlx model
-GPTQModel.export(gptq_model_path, mlx_path, "mlx")
 
-
+```shell
+# install mlx-lm
+pip install mlx_lm
+```
+
+```python
 from mlx_lm import load, generate
 
+mlx_path = "ModelCloud/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-mlx-v3"
 mlx_model, tokenizer = load(mlx_path)
 prompt = "The capital of France is"
 
 messages = [{"role": "user", "content": prompt}]
 prompt = tokenizer.apply_chat_template(
     messages, add_generation_prompt=True
 )
 
 text = generate(mlx_model, tokenizer, prompt=prompt, verbose=True)
+```
+
+### Export GPTQ to MLX
+```shell
+# install gptqmodel with mlx
+pip install gptqmodel[mlx] --no-build-isolation
+```
+
+```python
+from gptqmodel import GPTQModel
+
+# load gptq quantized model
+gptq_model_path = "ModelCloud/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-v3"
+mlx_path = "./vortex/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-mlx-v3"
+
+# export to mlx model
+GPTQModel.export(gptq_model_path, mlx_path, "mlx")
 ```
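For a quick smoke test of the exported folder without writing any Python, mlx-lm also ships a command-line generate entry point. A minimal sketch, assuming the export step above wrote its output to the local `./vortex/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-mlx-v3` directory:

```shell
# run one prompt against the exported mlx model from the CLI
python -m mlx_lm.generate \
  --model ./vortex/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-mlx-v3 \
  --prompt "The capital of France is"
```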