Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency Paper • 2311.02772 • Published Nov 5, 2023
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities Paper • 2311.05698 • Published Nov 9, 2023
Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems Paper • 2311.05884 • Published Nov 10, 2023
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers Paper • 2311.10642 • Published Nov 17, 2023