Songlin Yang

sonta7

https://sustcsonglin.github.io/

AI & ML interests

None yet

Recent Activity

liked a dataset about 1 month ago

codeparrot/codeparrot-clean-valid

sonta7's activity

upvoted a paper 14 days ago

Gated Delta Networks: Improving Mamba2 with Delta Rule

Paper • 2412.06464 • Published 15 days ago • 9

upvoted an article 2 months ago

History of State Space Models (SSM) in 2022

Article • Published Apr 11 • 15

upvoted 2 papers 3 months ago

A Controlled Study on Long Context Extension and Generalization in LLMs

Paper • 2409.12181 • Published Sep 18 • 43

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Paper • 2409.07146 • Published Sep 11 • 19

upvoted a paper 7 months ago

Parallelizing Linear Transformers with the Delta Rule over Sequence Length

Paper • 2406.06484 • Published Jun 10 • 3

upvoted a collection 8 months ago

based

These language model checkpoints are trained at the 360M and 1.3Bn parameter scales for up to 50Bn tokens on the Pile corpus, for research purposes.

Collection • 15 items • Updated Oct 18 • 9