RotRNN: Modelling Long Sequences with Rotations
arXiv:2407.07239v3
Abstract: Linear recurrent neural networks, such as State Space Models (SSMs) and Linear Recurrent Units (LRUs), have recently shown state-of-the-art performance on long sequence modelling benchmarks. Despite their success, their empirical performance is not well understood and they come with a number of drawbacks, most notably their complex initialisation and normalisation schemes. In this work, we address some of these issues by proposing RotR...