The recent emergence of large-scale biomedical data presents exciting opportunities for scientific discovery. However, the extreme dimensionality of the data and non-negligible measurement errors can make estimation difficult. Methods for high-dimensional covariates with measurement error are limited: they usually require knowledge of the noise distribution and focus on linear or generalized linear models. In this work, we develop high-dimensional measurement error models for a class of Lipschitz loss functions, including logistic regression, hinge loss, quantile regression, and others. Our estimator is designed to minimize the $L_1$ norm among all estimators belonging to a suitable feasible set, without requiring knowledge of the noise distribution. We then generalize these estimators to a Lasso analog that is computationally scalable to higher dimensions. We derive theoretical guarantees in terms of finite-sample statistical error bounds and sign consistency, even when the dimension grows exponentially with the sample size. Large-scale simulation studies demonstrate superior performance compared to existing methods on classification and quantile regression problems. An application to a sex classification task based on functional brain connectivity in the Human Connectome Project data shows improved accuracy under our approach, along with the ability to reliably identify key brain connections that drive gender differences.
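As a rough illustration of what "Lipschitz loss" means for the three losses named above, the following minimal sketch evaluates each loss on a grid and reports its largest difference quotient, which stays bounded by 1. The function names, the grid, and the choice `tau=0.9` are illustrative assumptions; this is not the authors' code and does not implement their estimator.

```python
import math

def logistic_loss(margin):
    # log(1 + exp(-margin)), written to avoid overflow for large |margin|
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def hinge_loss(margin):
    # max(0, 1 - margin), the loss used by support vector machines
    return max(0.0, 1.0 - margin)

def check_loss(residual, tau=0.5):
    # quantile ("pinball") loss: slope tau for residual > 0, tau - 1 for residual < 0
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def max_slope(loss, grid):
    # largest absolute difference quotient between adjacent grid points,
    # a numerical lower bound on the Lipschitz constant over the grid
    return max(abs(loss(b) - loss(a)) / (b - a) for a, b in zip(grid, grid[1:]))

grid = [i / 10.0 for i in range(-100, 101)]  # margins/residuals in [-10, 10]
slopes = {
    "logistic": max_slope(logistic_loss, grid),
    "hinge": max_slope(hinge_loss, grid),
    "quantile": max_slope(lambda r: check_loss(r, tau=0.9), grid),
}

for name, slope in slopes.items():
    print(f"{name}: max slope on grid = {slope:.3f}")
```

By contrast, the squared-error loss has a slope that grows with the size of the residual, so it is not globally Lipschitz.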
