license: bsd-3-clause

# Interactor

Rethinking Transformer Architecture with Parameter Attention, Layer Normalization Nonlinearity and Gated Linear Unit Parametrized Memory

Paper coming soon.