ITNet: A Learnable Integral Transform That Subsumes Convolution, Attention, and Recurrence

Convolutional networks, recurrent networks, and transformers each encode different inductive biases -- locality, sequential memory, and content-dependent pairwise interaction -- and have remained mathematically distinct since their inception. We show that this fragmentation reflects not a fundamental diversity in how signals should be processed, but rather incomplete views of a single underlying mathematical object: a learnable integral transfor...
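The idea that the three families can be read as kernels of one transform can be illustrated with a small discrete sketch. This is not ITNet's formulation, which the excerpt above does not give. It is a standard kernel view, written under assumptions: the integral is replaced by a finite sum y[i] = sum_j K(i, j) * x[j], inputs are scalar sequences, the attention example uses the inputs themselves as queries and keys, and all function names (`integral_transform`, `conv_kernel`, `recurrence_kernel`, `attention_kernel`) are hypothetical.

```python
import math


def integral_transform(x, kernel):
    """Discrete analogue of y(t) = integral of K(t, s) x(s) ds."""
    n = len(x)
    return [sum(kernel(i, j) * x[j] for j in range(n)) for i in range(n)]


def conv_kernel(weights):
    # Locality: the kernel depends only on the offset i - j and is zero
    # outside a small window. `weights` maps offset -> weight.
    return lambda i, j: weights.get(i - j, 0.0)


def recurrence_kernel(a):
    # Sequential memory: the linear recurrence h[i] = a * h[i-1] + x[i]
    # (with h[-1] = 0) unrolls to h[i] = sum_{j <= i} a**(i - j) * x[j],
    # which is a causal, exponentially decaying kernel.
    return lambda i, j: a ** (i - j) if j <= i else 0.0


def attention_kernel(x):
    # Content-dependent pairwise interaction: the kernel is a row-wise
    # softmax over scores x[i] * x[j] (assumed scalar queries and keys).
    n = len(x)
    rows = []
    for i in range(n):
        scores = [x[i] * x[j] for j in range(n)]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return lambda i, j: rows[i][j]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    print(integral_transform(x, conv_kernel({-1: 0.25, 0: 0.5, 1: 0.25})))
    print(integral_transform(x, recurrence_kernel(0.5)))
    print(integral_transform(x, attention_kernel(x)))
```

In this sketch only the kernel changes between the three cases: a fixed function of the offset for convolution, a fixed causal function for the recurrence, and a function of the input itself for attention. A learnable transform would parameterize `kernel` rather than hard-code it.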
