sglang is not supported

#18 opened by zhangdahaodaddy

```sh
docker run --name vllm-minicpm3-4b --runtime nvidia --gpus '"device=0,1"' \
  -v /home/jszc/vllm:/root/.cache/modelscope \
  -p 11436:8080 --ipc=host \
  my_vllm_updated:latest \
  --model /root/.cache/modelscope/MiniCPM3-4B --port=8080 \
  --served-model-name llama3.1-8b \
  --gpu-memory-utilization 0.95 --max_num_seqs 1024 --max_num_batched_tokens 8192 \
  -tp 2 --enable-chunked-prefill true --enable-prefix-caching --trust-remote-code
```

OpenBMB org

The latest main branch of sglang has merged our commit adding MiniCPM3 support. You can wait for the next sglang release, or build from source to get a version that supports MiniCPM3.
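For example, a source build and launch might look like the following sketch (assuming sglang's documented editable install from the repo's `python/` directory; the model path and port here are placeholders, so adjust them for your setup):

```sh
# Build sglang from the current main branch, which includes the merged MiniCPM3 support.
git clone https://github.com/sgl-project/sglang.git
cd sglang
pip install -e "python[all]"

# Launch a server for MiniCPM3; --trust-remote-code is needed because
# the model ships custom modeling code.
python -m sglang.launch_server \
  --model-path openbmb/MiniCPM3-4B \
  --trust-remote-code \
  --port 30000
```

Once the server is up, it should expose an OpenAI-compatible API on the chosen port.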

neoz changed discussion status to closed


Thank you for your reply
