Michael Goin PRO
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Recent Activity
updated a model about 1 hour ago
nm-testing/pixtral-12b-FP8-dynamic-all
updated a model 3 days ago
mistralai/Pixtral-Large-Instruct-2411
New activity 3 days ago
mistralai/Pixtral-Large-Instruct-2411
mgoin's activity
Add config_format and load_format to vLLM args
#5 opened 3 days ago by mgoin
Update config.json to use null for sliding_window
#4 opened 3 days ago by mgoin
Adding `safetensors` variant of this model
#1 opened 9 days ago by SFconvertbot
Is this the standard GPTQ quantization?
1
#5 opened 20 days ago by molereddy
Model weights are not loaded
4
#3 opened 3 months ago by MarvelousMouse
Update model card
#1 opened 20 days ago by nm-research
Add chat_template to tokenizer_config.json
#1 opened 21 days ago by nm-research
Why is the Pixtral activation function "gelu" when the reference code uses "silu"?
2
#10 opened about 1 month ago by mgoin
Update tokenizer_config.json with chat_template
2
#11 opened about 1 month ago by mgoin
Any chance your team is working on a 4-bit Llama-3.2-90B-Vision-Instruct-quantized.w4a16 version?
1
#1 opened about 2 months ago by mrhendrey
Oom with 24g vram
3
#1 opened about 2 months ago by Klopez
latest vllm docker (v0.6.2) fail to load
2
#1 opened about 2 months ago by choronz333
Issue with loading model
1
#1 opened 3 months ago by xSumukhax
Can it run on A100/A800 with VLLM?
3
#1 opened 4 months ago by Parkerlambert123
weights does not exist when trying to deploy in sagemaker endpoint
1
#1 opened 3 months ago by LorenzoCevolaniAXA
8-kv-heads
4
#17 opened 4 months ago by ArthurZ
8-kv-heads
3
#21 opened 4 months ago by ArthurZ
run with vllm
8
#4 opened 4 months ago by kuliev-vitaly
Not able to run Model using VLLM
1
#3 opened 4 months ago by Pchaudhary