Haoze Wu

WaitHZ

https://waithz.github.io/

AI & ML interests

Modular DL, Complex Reasoning

Organizations

None yet

WaitHZ's activity

upvoted an article about 4 hours ago

You could have designed state of the art positional encoding

Article • Nov 25, 2024 • 142

upvoted 2 papers 12 days ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published 13 days ago • 42

Autonomy-of-Experts Models

Paper • 2501.13074 • Published 14 days ago • 40

upvoted 2 papers 5 months ago

Benchmarking Chinese Knowledge Rectification in Large Language Models

Paper • 2409.05806 • Published Sep 9, 2024 • 14

OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs

Paper • 2409.05152 • Published Sep 8, 2024 • 31

upvoted a paper 7 months ago

GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory

Paper • 2406.12375 • Published Jun 18, 2024 • 1

upvoted a paper 10 months ago

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 104