Outliers are prevalent in big-data applications and can severely affect statistical estimation and inference. In this paper, we introduce a framework of outlier-resistant estimation that robustifies an arbitrarily given loss function. The framework is closely related to the method of trimming and includes explicit outlyingness parameters for all samples, which facilitates computation, theory, and parameter tuning. To tackle non-convexity and non-smoothness, we develop scalable algorithms that are easy to implement and have guaranteed fast convergence. In particular, a new technique is proposed to relax the requirement on the starting point so that, on regular datasets, the number of data resamplings can be substantially reduced. By combining statistical and computational treatments, we are able to perform non-asymptotic analysis beyond M-estimation. The resulting resistant estimators are not necessarily globally or even locally optimal, but they enjoy minimax rate optimality in both low and high dimensions. Experiments in regression, classification, and neural networks show excellent performance of the proposed methodology in the presence of gross outliers.
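
To make the link between trimming and per-sample outlier parameters concrete, the following is a minimal sketch for the special case of squared-error loss. It is an illustration under stated assumptions, not the algorithm of the paper: it assumes a mean-shift model y = Xβ + γ + noise in which γ has at most q nonzero entries, and it alternates a least-squares fit with keeping the q largest absolute residuals as outlier shifts. The function name `trimmed_least_squares`, the budget `q`, and the simulated data are all hypothetical.

```python
import numpy as np

def trimmed_least_squares(X, y, q, n_iter=50):
    """Alternate a least-squares fit with hard-thresholding of residuals.

    Illustrative assumption: y = X @ beta + gamma + noise, where gamma holds
    one outlier parameter per sample and has at most q nonzero entries.
    """
    n = X.shape[0]
    gamma = np.zeros(n)
    for _ in range(n_iter):
        # Fit beta on the response adjusted by the current outlier shifts.
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        resid = y - X @ beta
        # Keep only the q largest absolute residuals as outlier shifts.
        new_gamma = np.zeros(n)
        if q > 0:
            top = np.argsort(np.abs(resid))[-q:]
            new_gamma[top] = resid[top]
        if np.allclose(new_gamma, gamma):
            break
        gamma = new_gamma
    return beta, gamma

# Simulated data (hypothetical): a clean linear model plus ten gross outliers.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)
y[:10] += 20.0

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_rob, gamma = trimmed_least_squares(X, y, q=20)
```

Samples with a nonzero entry in `gamma` are fitted exactly and so have no influence on `beta_rob`, which is why this formulation behaves like trimming. The budget `q` only needs to be an upper bound on the number of outliers; here it is deliberately set larger than the ten that were planted.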
