[Submitted on 14 Oct 2022]

Abstract: Increasingly high-dimensional datasets require that estimation methods not only satisfy statistical guarantees but also remain computationally feasible. In this context, we consider $L^{2}$-boosting via orthogonal matching pursuit in high-dimensional linear models and analyze the algorithm’s data-driven early stopping time $\tau$, which is sequential in the sense that its computation is based only on the first $\tau$ iterations. This approach is much less costly than established model selection criteria, which require computation of the full boosting path. In this setting, we prove that sequential early stopping preserves statistical optimality in terms of a fully general oracle inequality for the empirical risk and the recently established optimal convergence rates for the population risk. Finally, large-scale simulation studies show that, at significantly reduced computational cost, the performance of these types of methods is comparable to that of other state-of-the-art algorithms such as the cross-validated Lasso and model selection based on the full boosting path.
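To make the idea of a sequential stopping time concrete, here is a minimal Python sketch of orthogonal matching pursuit that stops as soon as the empirical squared residual falls below a threshold. The abstract does not state the stopping rule, so the residual-threshold (discrepancy-type) criterion, the function name `omp_early_stopping`, and the parameter `kappa` are all assumptions made for illustration, not the paper's definitions. The point it illustrates is only that the stopping index depends on the iterations computed so far, so the full boosting path is never needed.

```python
import numpy as np


def omp_early_stopping(X, y, kappa, max_iter=None):
    """Orthogonal matching pursuit with a sequential, residual-based stop.

    Assumption (not taken from the abstract): iterate until the mean
    squared residual drops to `kappa` or below. Returns the coefficient
    vector, the selected column indices, and the stopping index tau.
    """
    n, p = X.shape
    if max_iter is None:
        max_iter = min(n, p)

    # Column norms, used to compare correlations on a common scale.
    col_norms = np.linalg.norm(X, axis=0)
    col_norms[col_norms == 0] = 1.0

    active = []                      # indices selected so far
    coef = np.zeros(p)
    residual = np.asarray(y, dtype=float).copy()
    tau = 0

    # The stopping check uses only the current residual, i.e. only
    # information from the iterations already performed.
    while tau < max_iter and np.mean(residual ** 2) > kappa:
        corr = np.abs(X.T @ residual) / col_norms
        if active:
            corr[active] = -np.inf   # never reselect a column
        j = int(np.argmax(corr))
        active.append(j)

        # Orthogonal step: least-squares refit on all selected columns.
        beta_active, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        coef = np.zeros(p)
        coef[active] = beta_active
        residual = y - X[:, active] @ beta_active
        tau += 1

    return coef, active, tau
```

In practice `kappa` would have to be tied to the noise level, for example through a variance estimate; how to choose it is outside the scope of this sketch.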

Submission history

From: Bernhard Stankewitz


Friday, October 14, 2022 14:23:40 UTC (1,114 KB)
