One of the major open problems in machine learning is characterizing generalization in the overparameterized regime, where most traditional generalization bounds become inconsistent (Nagarajan and Kolter, 2019). In many scenarios, their failure can be attributed to obscuring the crucial interplay between the training algorithm and the underlying data distribution. To address this issue, we propose a concept named compatibility, which quantitatively characterizes generalization in a manner that is both data-dependent and algorithm-dependent. By considering the entire training trajectory and focusing on early-stopping iterates, compatibility exploits both data and algorithmic information and is therefore a more suitable notion of generalization. We validate this by theoretically studying compatibility in the setting of solving overparameterized linear regression with gradient descent. Specifically, we perform a data-dependent trajectory analysis and derive a sufficient condition for compatibility in this setting. Our theoretical results show that, in the sense of compatibility, generalization holds under significantly weaker restrictions on the problem instance than those required by previous last-iterate analyses.
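To make the setting concrete, the following is a minimal sketch of gradient descent on an overparameterized linear regression problem that records the excess risk of every iterate, so that the best early-stopped iterate can be compared with the last one. It illustrates the setting only and is not the paper's analysis or its definition of compatibility. The dimensions, the decaying covariance spectrum, the noise level, the step size, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def gd_trajectory(X, y, lr, steps):
    """Run gradient descent on the squared loss from w = 0 and keep every iterate."""
    n, d = X.shape
    w = np.zeros(d)
    iterates = [w.copy()]
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n  # gradient of (1 / 2n) * ||Xw - y||^2
        w = w - lr * grad
        iterates.append(w.copy())
    return np.array(iterates)


def excess_risk(w, w_star, cov_diag):
    """Population excess risk 0.5 * (w - w*)^T Sigma (w - w*) for a diagonal Sigma."""
    diff = w - w_star
    return 0.5 * np.sum(cov_diag * diff ** 2)


# Hypothetical problem instance: more features (d) than samples (n).
rng = np.random.default_rng(0)
n, d = 50, 500
cov_diag = 1.0 / np.arange(1, d + 1) ** 2       # assumed decaying spectrum
w_star = np.zeros(d)
w_star[:5] = 1.0                                # assumed ground-truth parameter
X = rng.normal(size=(n, d)) * np.sqrt(cov_diag)  # rows drawn from N(0, Sigma)
y = X @ w_star + rng.normal(scale=0.5, size=n)   # noisy labels

# Step size 1 / L, where L is the largest eigenvalue of the empirical covariance.
L = np.linalg.norm(X, 2) ** 2 / n
steps = 2000
iterates = gd_trajectory(X, y, lr=1.0 / L, steps=steps)

risks = np.array([excess_risk(w, w_star, cov_diag) for w in iterates])
train_losses = np.array([0.5 * np.mean((X @ w - y) ** 2) for w in iterates])

best_t = int(np.argmin(risks))
print(f"best early-stopping step: {best_t}, excess risk: {risks[best_t]:.4f}")
print(f"last iterate (step {steps}) excess risk: {risks[-1]:.4f}")
```

Tracking the whole trajectory, rather than only the final iterate, is what allows a comparison between an early-stopped iterate and the last one. How large the gap is depends on the assumed spectrum and noise level.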
