Mukul
mtcl
AI & ML interests
None yet
Recent Activity
new activity 5 days ago
nvidia/DeepSeek-V4-Pro-NVFP4: nvidia/DeepSeek-V4-flash-NVFP4
new activity 5 days ago
canada-quant/DeepSeek-V4-Flash-NVFP4-FP8-MTP: Docker Image
new activity 6 days ago
unsloth/DeepSeek-V4-Flash: Worse than (smaller) MiniMax M2.7??
Organizations
None yet
nvidia/DeepSeek-V4-flash-NVFP4
4
#1 opened 6 days ago
by
mtcl
Docker Image
5
#1 opened 6 days ago
by
mtcl
Worse than (smaller) MiniMax M2.7??
17
#2 opened about 1 month ago
by deleted
Unable to run on 2x RTX Pro 6000 (DEEP_GEMM problem)
➕ 10
17
#15 opened about 1 month ago
by
stev236
Running on 2 RTX Pro 6000 Blackwell GPUs at ~30 tps (Instructions that worked for me)
👍❤️ 7
10
#17 opened about 1 month ago
by
CarouselAether
2x Nvidia 6000 Pros
3
#2 opened about 1 month ago
by
mtcl
Will it work on 2X6000 Pros
6
#1 opened about 1 month ago
by
mtcl
Can I deploy it with sglang at my 8*4090 ubuntu server?
9
#1 opened about 1 month ago
by
marshal007
Context Length for 2X6000 Pros (2x96 = 192GB VRAM)
3
#2 opened about 1 month ago
by
mtcl
really awesome speeds! running at 256k context.
🔥 1
5
#11 opened about 1 month ago
by
mtcl
MOE 122b and 397b please!
🚀 24
14
#7 opened about 1 month ago
by
jesleocizi
How to disable thinking?
4
#9 opened about 1 month ago
by
Hansi2024
These are NOT actual AWQ-quantized models.
2
#1 opened about 2 months ago
by
cai-cai
max context
#2 opened about 1 month ago
by
mtcl
No think tags.
10
#4 opened about 1 month ago
by
DrRos
Minimax M2.7 NVFP4
👀🔥 5
4
#4 opened about 2 months ago
by
mtcl
Unable to use full 192k context in SGLang with MiniMax-M2.7-NVFP4 (runtime capped at ~80,964 tokens)
3
#9 opened about 1 month ago
by
mtcl
w1 not matching w3 weight scales
12
#1 opened about 2 months ago
by
dareposte
tokenizer component mismatch and w1_weight_scale_2 must match w3_weight_scale_2. Accuracy may be affected issue
1
#5 opened about 2 months ago
by
mtcl
Minimax 2.7 !!!!
👍 5
3
#3 opened about 2 months ago
by
mtcl