arxiv:2412.14590
Zhen Zheng
JamesTheZ
AI & ML interests
Large-scale machine learning system optimization.
Recent Activity
commented on a paper 11 days ago
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design
authored a paper 12 days ago
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design
authored a paper 12 days ago
BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching