Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs • arXiv:2407.12117 • Published Jul 16, 2024
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling • arXiv:2405.14578 • Published May 23, 2024
HMoE: Heterogeneous Mixture of Experts for Language Modeling • arXiv:2408.10681 • Published Aug 20, 2024
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent • arXiv:2411.02265 • Published Nov 4, 2024
3DCNN-DQN-RNN: A Deep Reinforcement Learning Framework for Semantic Parsing of Large-scale 3D Point Clouds • arXiv:1707.06783 • Published Jul 21, 2017