Files changed (1)
README.md +29 -13

@@ -1,26 +1,42 @@
- This model was exported using [GPTQModel](https://github.com/ModelCloud/GPTQModel). Below is example code for exporting a model from GPTQ format to MLX format.
+ This model was quantized and exported to mlx using [GPTQModel](https://github.com/ModelCloud/GPTQModel).
  
- ## Example:
- ```python
- from gptqmodel import GPTQModel
- 
- # load gptq quantized model
- gptq_model_path = "ModelCloud/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-v3"
- mlx_path = f"./vortex/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-v3-mlx"
- 
- # export to mlx model
- GPTQModel.export(gptq_model_path, mlx_path, "mlx")
- 
- # load mlx model check if it works
+ ## How to run this model
+ 
+ ```shell
+ # install mlx
+ pip install mlx_lm
+ ```
+ 
+ ```python
  from mlx_lm import load, generate
  
+ mlx_path = "ModelCloud/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-mlx-v3"
  mlx_model, tokenizer = load(mlx_path)
  prompt = "The capital of France is"
  
- messages = [{"role": "user", "content": prompt}]
- prompt = tokenizer.apply_chat_template(
-     messages, add_generation_prompt=True
- )
  
  text = generate(mlx_model, tokenizer, prompt=prompt, verbose=True)
+ ```
+ 
+ ### Export gptq to mlx
+ ```shell
+ # install gptqmodel with mlx
+ pip install gptqmodel[mlx] --no-build-isolation
+ ```
+ 
+ ```python
+ from gptqmodel import GPTQModel
+ 
+ # load gptq quantized model
+ gptq_model_path = "ModelCloud/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-v3"
+ mlx_path = "./vortex/Llama-3.2-3B-Instruct-gptqmodel-4bit-vortex-mlx-v3"
+ 
+ # export to mlx model
+ GPTQModel.export(gptq_model_path, mlx_path, "mlx")
  ```