kd-tensor's Collections
toread
Papers to read
Synthetic Data Generation
Papers to read
updated Jul 7
Jamba: A Hybrid Transformer-Mamba Language Model • Paper 2403.19887 • Published Mar 28 • 103
The Unreasonable Ineffectiveness of the Deeper Layers • Paper 2403.17887 • Published Mar 26 • 77
Tuning Language Models by Proxy • Paper 2401.08565 • Published Jan 16 • 20