ParaRNN [from Apple Research]
github.com·12h·
Discuss: Hacker News

ParaRNN

ParaRNN is a high-performance package for automating the parallel application of RNNs along the sequence length, dramatically speeding up RNN applications compared to traditional sequential approaches.

The code was developed as part of the publication: ParaRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models.


Overview

Traditional RNN processing requires updating the RNN hidden state as the input sequence is analyzed: an inherently sequential procedure, which makes applying RNNs to long sequences time-consuming. ParaRNN overcomes this issue by combining Newton's method with parallel reduction algorithms to evaluate the RNN recurrence in parallel along the sequence length. T…
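
To make the idea concrete, here is a minimal sketch (not the ParaRNN API; a NumPy illustration with assumed names and a toy tanh RNN) of how a nonlinear recurrence h_t = f(h_{t-1}, x_t) can be treated as a root-finding problem and solved with Newton iterations, where each Newton step reduces to an affine recurrence that an associative scan, i.e. a parallel reduction, can evaluate:

```python
# Sketch only: illustrates Newton + associative scan for a nonlinear RNN,
# not the actual ParaRNN implementation or API.
import numpy as np

def f(W, U, h_prev, x_t):
    # Toy nonlinear RNN cell: h_t = tanh(W h_{t-1} + U x_t)
    return np.tanh(W @ h_prev + U @ x_t)

def jac_h(W, U, h_prev, x_t):
    # Jacobian of f with respect to h_prev: diag(1 - tanh(.)^2) @ W
    pre = W @ h_prev + U @ x_t
    return (1.0 - np.tanh(pre) ** 2)[:, None] * W

def assoc_combine(Aa, ba, Ab, bb):
    # Composition of two affine maps h -> A h + b (apply (Aa, ba) first).
    return Ab @ Aa, Ab @ ba + bb

def affine_scan(As, bs):
    # Inclusive scan over affine maps. The sequential loop stands in for the
    # logarithmic-depth parallel reduction a GPU kernel would use; the
    # associative combine operation is the same.
    out_A, out_b = [As[0]], [bs[0]]
    for t in range(1, len(As)):
        A, b = assoc_combine(out_A[-1], out_b[-1], As[t], bs[t])
        out_A.append(A)
        out_b.append(b)
    return out_A, out_b

def newton_parallel_rnn(W, U, X, h0, iters=8):
    # Solve F_t(H) = h_t - f(h_{t-1}, x_t) = 0 for all t simultaneously.
    T, d = X.shape[0], h0.shape[0]
    H = np.zeros((T, d))                       # initial guess for all states
    for _ in range(iters):
        H_prev = np.vstack([h0[None, :], H[:-1]])
        r = H - np.array([f(W, U, H_prev[t], X[t]) for t in range(T)])
        # Newton system: delta_t - J_t delta_{t-1} = -r_t, with delta for h0 = 0,
        # i.e. an affine recurrence delta_t = J_t delta_{t-1} - r_t.
        Js = [jac_h(W, U, H_prev[t], X[t]) for t in range(T)]
        _, deltas = affine_scan(Js, [-r[t] for t in range(T)])
        H = H + np.array(deltas)
    return H
```

Each Newton iteration costs only the parallel evaluation of the cell, its Jacobians, and one scan, so the sequential bottleneck of stepping through time disappears; the number of Newton iterations needed is small in practice compared to the sequence length.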
