
Hacker News: Front Page
shared a link post in group #Stream of Goodies
arxiv.org
Transformers are Multi-State RNNs
Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only transformers can in fact be conceptualized as multi-state RNNs - an RNN variant whose hidden state grows without bound.
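
A minimal sketch of the idea in the title, not the authors' code: a decoder-only attention head can be read as a recurrent cell whose "multi-state" is the growing key/value (KV) cache - each step appends one slot to the state, then attends over it. All names below (`msrnn_step`, the toy dimensions) are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def msrnn_step(state, x, Wq, Wk, Wv):
    """One recurrent step of a single attention head, RNN-style.

    state: (K, V) arrays of shape (t, d) - the KV cache so far
    x:     current token embedding, shape (d,)
    """
    K, V = state
    k, v = x @ Wk, x @ Wv
    K = np.vstack([K, k])  # the "hidden state" grows by one slot per token
    V = np.vstack([V, v])
    q = x @ Wq
    attn = softmax(K @ q / np.sqrt(len(q)))  # causal: only past + current slots exist
    out = attn @ V                           # output depends on state and input alone
    return (K, V), out

# Usage: process a toy sequence token by token, like an RNN.
rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
state = (np.empty((0, d)), np.empty((0, d)))
for x in rng.standard_normal((5, d)):
    state, out = msrnn_step(state, x, Wq, Wk, Wv)
```

Capping the number of rows kept in `K` and `V` would turn this into a fixed-size-state RNN, which is the kind of bounded variant the framing naturally suggests.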