Generalization in Representation Models via Random Matrix Theory: Application to Recurrent Networks

View PDF

Abstract:We first study the generalization error of models that use a fixed feature representation (frozen intermediate layers) followed by a trainable readout layer. This setting encompasses a range of architectures, from deep random-feature models to echo-state networks (ESNs) with recurrent dynamics. Working in the high-dimensional regime, we apply Random Matrix Theory to derive a closed-form expression for the asymptotic generalization error. We then apply this analysis to recurrent representations and obtain concise formula that characterize their performance. Surprisingly, we show that a linear ESN is equivalent to ridge regression with an exponentially time-weighted (’‘memory’’) input covariance, revealing a clear inductive b…

View PDF

Abstract:We first study the generalization error of models that use a fixed feature representation (frozen intermediate layers) followed by a trainable readout layer. This setting encompasses a range of architectures, from deep random-feature models to echo-state networks (ESNs) with recurrent dynamics. Working in the high-dimensional regime, we apply Random Matrix Theory to derive a closed-form expression for the asymptotic generalization error. We then apply this analysis to recurrent representations and obtain concise formula that characterize their performance. Surprisingly, we show that a linear ESN is equivalent to ridge regression with an exponentially time-weighted (’‘memory’’) input covariance, revealing a clear inductive bias toward recent inputs. Experiments match predictions: ESNs win in low-sample, short-memory regimes, while ridge prevails with more data or long-range dependencies. Our methodology provides a general framework for analyzing overparameterized models and offers insights into the behavior of deep learning networks.


Subjects:	Statistics Theory (math.ST); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2511.02401 [math.ST]
	(or arXiv:2511.02401v1 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.2511.02401 arXiv-issued DOI via DataCite

Submission history

From: Yessin Moakher [view email] [via CCSD proxy] [v1] Tue, 4 Nov 2025 09:30:31 UTC (2,193 KB)

Submission history

Similar Posts