Predicting the Order of Upcoming Tokens Improves Language Modeling Paper • 2508.19228 • Published Aug 26 • 22