Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published 21 days ago • 65
view post Post 5587 RWKV-7 "Goose" preview rc2 => Peak RNN architecture?😃Will try to squeeze more performance for the final release. Preview code & model: https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v7 2 replies · 👀 11 11 🚀 4 4 👍 3 3 ❤️ 2 2 🔥 1 1 + Reply
Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task Paper • 2406.14213 • Published Jun 20, 2024 • 21