Abstract
Recent works have developed model-complexity-based and algorithm-based generalization error bounds to explain how stochastic gradient descent (SGD) methods help over-parameterized models generalize better. However, these analyses are limited in scope and fall short of a comprehensive explanation. In this paper, we propose a novel Gaussian approximation framework for establishing generalization error bounds for the 𝐰-SGD family, a class of SGD methods with asymptotically unbiased and uniformly bounded gradient noise.