134 7 29

Steve Li

CHNtentes

CHNtentes

AI & ML interests

None yet

Recent Activity

new activity about 12 hours ago

zai-org/GLM-4.6-FP8:Missing MTP?

upvoted a paper 1 day ago

Less is More: Recursive Reasoning with Tiny Networks

new activity 1 day ago

unsloth/GLM-4.5-Air-GGUF:What speed do you get at Q8 on AMD Ryzen™ AI Max+ 395

View all activity

Organizations

None yet

New activity in zai-org/GLM-4.6-FP8 about 12 hours ago

Missing MTP?

#1 opened 8 days ago by

jondurbin

New activity in unsloth/GLM-4.5-Air-GGUF 1 day ago

What speed do you get at Q8 on AMD Ryzen™ AI Max+ 395

#14 opened 5 days ago by

akierum

New activity in Qwen/Qwen3-Omni-30B-A3B-Instruct 14 days ago

Quantization issues

#17 opened 15 days ago by

stev236

New activity in Qwen/Qwen3-Omni-30B-A3B-Instruct 18 days ago

Update README.md

#9 opened 18 days ago by

CHNtentes

New activity in unsloth/Qwen3-Next-80B-A3B-Instruct-bnb-4bit 25 days ago

Error when using vLLM

➕ 11

#2 opened 28 days ago by

sheliak

New activity in LLM360/K2-Think 26 days ago

Evaluation sloppiness / benchmark cheating?

👍 2

#9 opened 26 days ago by

jaens

New activity in unsloth/GLM-4.5-Air-GGUF 28 days ago

Corrected jinja template with tool Support works with PR llama.cpp/pull/15186

❤️ 2

#9 opened 2 months ago by

xbruce22

New activity in cpatonn/Qwen3-Next-80B-A3B-Instruct-AWQ-4bit 29 days ago

Error when running in VLLM

👍 2

#1 opened 29 days ago by

d8rt8v

New activity in Qwen/Qwen3-Next-80B-A3B-Instruct 29 days ago

How much GPU memory is needed for local deployment?

#7 opened 30 days ago by

XuehangCang

Plan for AWQ?

➕ 24

#8 opened 30 days ago by

hyunw55

New activity in moonshotai/Kimi-K2-Instruct-0905 about 1 month ago

Considering a distilled version of 80B parameters

➕ 1

#2 opened about 1 month ago by

snapo

New activity in unsloth/DeepSeek-V3.1-GGUF about 2 months ago

changed tool call format?

#2 opened about 2 months ago by

CHNtentes

Thanks!

❤️ 2

#1 opened about 2 months ago by

segmond

New activity in deepseek-ai/DeepSeek-V3.1 about 2 months ago

Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)?

#17 opened about 2 months ago by

Lissanro

bro you forgot to put it under the v3.1 collection

👀 1

#12 opened about 2 months ago by

DingzhenPearl

New activity in deepseek-ai/DeepSeek-V3.1-Base about 2 months ago

Why taking so long to add more information?

#12 opened about 2 months ago by

cinnybun02

New activity in nvidia/NVIDIA-Nemotron-Nano-9B-v2 about 2 months ago

This just trades general performance for domain specific gains.

🔥 👍 15

#3 opened about 2 months ago by

phil111

New activity in Qwen/Qwen3-4B-Instruct-2507 about 2 months ago

1.7b 2507?

👍 5

#7 opened about 2 months ago by

CHNtentes

New activity in google/gemma-3-270m-it about 2 months ago

Gemma A3B

👍 5

#3 opened about 2 months ago by

Maria99934

New activity in zai-org/GLM-4.5V 2 months ago

Text performance compared to GLM-4.5 Air

👀 5

#1 opened 2 months ago by

Dampfinchen

Steve Li

AI & ML interests

Recent Activity

Organizations

CHNtentes's activity

Missing MTP?

What speed do you get at Q8 on AMD Ryzen™ AI Max+ 395

Quantization issues

Update README.md

Error when using vLLM

Evaluation sloppiness / benchmark cheating?

Corrected jinja template with tool Support works with PR llama.cpp/pull/15186

Error when running in VLLM

How much GPU memory is needed for local deployment?

Plan for AWQ?

Considering a distilled version of 80B parameters

changed tool call format?

Thanks!

Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)?

bro you forgot to put it under the v3.1 collection

Why taking so long to add more information?

This just trades general performance for domain specific gains.

1.7b 2507?

Gemma A3B

Text performance compared to GLM-4.5 Air