Update README.md

README.md CHANGED

@@ -23,7 +23,7 @@ uv pip install vllm --torch-backend=auto
 from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
 import torch
 
-MODEL_PATH = "
+MODEL_PATH = "Intel/MiniMax-M2.5-int4-AutoRound"
 
 model = AutoModelForCausalLM.from_pretrained(
     MODEL_PATH,
@@ -50,7 +50,7 @@ print(response)
 ### VLLM Usage
 
 ```bash
-vllm serve INC4AI/MiniMax-M2.5-int4-mixed-AutoRound \
+vllm serve Intel/MiniMax-M2.5-int4-AutoRound \
     --port 7777 \
     --host localhost \
     --trust-remote-code \
@@ -64,7 +64,7 @@ vllm serve INC4AI/MiniMax-M2.5-int4-mixed-AutoRound \
 ## Generate the Model
 
 ```bash
-auto-round --model_name MiniMaxAI/MiniMax-M2.5 --scheme w4a16 --ignore_layers gate --iters 0 --output_dir MiniMax-M2.5-int4-
+auto-round --model_name MiniMaxAI/MiniMax-M2.5 --scheme w4a16 --ignore_layers gate --iters 0 --output_dir MiniMax-M2.5-int4-AutoRound
 ```
 
 ## Ethical Considerations and Limitations
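For context, here is a minimal sketch of the full transformers snippet that the first hunk edits. Only the imports, `MODEL_PATH`, and the opening of the `from_pretrained(...)` call appear in the diff, so the loading kwargs, prompt, and generation settings below are assumptions rather than lines from the README.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import torch

MODEL_PATH = "Intel/MiniMax-M2.5-int4-AutoRound"

# Loading kwargs are assumptions; the diff only shows the first argument.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)

# Hypothetical prompt; the README's actual example is not in the diff.
messages = [{"role": "user", "content": "Explain int4 weight-only quantization in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

gen_config = GenerationConfig(max_new_tokens=256, do_sample=False)
outputs = model.generate(inputs, generation_config=gen_config)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```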
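Once the server from the second hunk is running, it can be queried through vLLM's OpenAI-compatible endpoint. The host and port come from the flags above; the prompt and sampling settings here are placeholders.

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API; no real key is needed locally.
client = OpenAI(base_url="http://localhost:7777/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="Intel/MiniMax-M2.5-int4-AutoRound",  # served model name = model path
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
)
print(completion.choices[0].message.content)
```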
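The quantization command in the last hunk runs AutoRound's CLI with `--iters 0`, which skips the signed-rounding optimization and falls back to plain round-to-nearest. A rough Python-API sketch is below; the constructor and keyword names follow recent auto-round releases and should be verified against your installed version, and the CLI's `--ignore_layers gate` exclusion is omitted because its Python spelling varies across versions.

```python
# Assumption-laden sketch of the CLI call above, not the README's method.
from auto_round import AutoRound

ar = AutoRound(
    "MiniMaxAI/MiniMax-M2.5",  # FP16/BF16 source checkpoint
    scheme="W4A16",            # int4 weights, 16-bit activations
    iters=0,                   # no tuning: plain round-to-nearest
)
ar.quantize_and_save("MiniMax-M2.5-int4-AutoRound")
```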