Attention Substitution with Attention Free Transformer Gating Function Generation
Paper Coming Soon