arxiv:2311.11045
Xuxi Chen
Xuxi
AI & ML interests
None yet
Recent Activity
upvoted a paper 5 days ago
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
upvoted a paper 15 days ago
APOLLO: SGD-like Memory, AdamW-level Performance
Organizations
None yet