Generalization Below the Edge of Stability: The Role of Data Geometry
arxiv.org·2w

Abstract: Understanding generalization in overparameterized neural networks hinges on the interplay between the data geometry, neural architecture, and training dynamics. In this paper, we theoretically explore how data geometry controls this implicit bias. This paper presents theoretical results for overparameterized two-layer ReLU networks trained below the edge of stability. First, for data distributions supported on a mixture of low-dimensional balls, we derive generalization bounds that provably adapt to the intrinsic dimension. Second, for a family of isotropic distributions that vary in how str…