
FilipV

PheelaV
·

AI & ML interests

SW&DS&AI

Recent Activity

Organizations

None yet

PheelaV's activity

upvoted 2 articles about 1 month ago
Article

Accelerating Language Model Inference with Mixture of Attentions

By hba123 and 1 other
24
replied to lbourdois's post 10 months ago

Brilliant stuff. Personally I would love to see the discretization and latest A initialization digested. Looking forward to the upcoming posts!

upvoted an article 10 months ago
reacted to lbourdois's post with ❤️ 10 months ago
Post
3638
I stopped procrastinating and finally took the time to write the second article of my series of blog posts on SSM: https://huggingface.co/blog/lbourdois/ssm-2022.
In this blog post, I review the history of the SSM models released in 2022, covering more than 14 models in a condensed format.
They are separated into two parts: "theoretical" (DSS, S4D, GSS, Mega, S5, etc.) and "applications" (Sashimi, ViS4mer, CCNN, etc.).

To follow everything, it's best to have read the introductory blog post on SSM and S4 first: https://huggingface.co/blog/lbourdois/get-on-the-ssm-train.
All the articles in the series are listed in this space: lbourdois/SSM_blog_posts

Happy reading :)
  • 2 replies
replied to ArthurZ's post 11 months ago

Is the 2.8B model the one trained on the Pile or the SlimPJ one?