Statistics and Its Interface

Volume 11 (2018)

Number 3

Double sparsity kernel learning with automatic variable selection and data extraction

Pages: 401 – 420

DOI: https://dx.doi.org/10.4310/SII.2018.v11.n3.a1

Authors

Jingxiang Chen (Department of Biostatistics, University of North Carolina, Chapel Hill, N.C., U.S.A.)

Chong Zhang (Department of Statistics and Actuarial Science, University of Waterloo, Ontario, Canada)

Michael R. Kosorok (Department of Biostatistics, University of North Carolina, Chapel Hill, N.C., U.S.A.)

Yufeng Liu (Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, N.C., U.S.A.)

Abstract

Learning in a Reproducing Kernel Hilbert Space (RKHS) has been widely used in many scientific disciplines. Because an RKHS can be very flexible, it is common to impose a regularization term in the optimization to prevent overfitting. Standard RKHS learning employs the squared norm penalty of the learning function. Despite its success, many challenges remain. In particular, one cannot directly use the squared norm penalty for variable selection or data extraction. Therefore, when there exist noise predictors, or when the underlying function has a sparse representation in the dual space, the performance of standard RKHS learning can be suboptimal. In the literature, methods have been proposed for variable selection in RKHS learning, and a data sparsity constraint has been considered for data extraction. However, how to learn in an RKHS with both variable selection and data extraction simultaneously remains unclear. In this paper, we propose a unified RKHS learning method, namely, DOuble Sparsity Kernel (DOSK) learning, to overcome this challenge. An efficient algorithm is provided to solve the corresponding optimization problem. We prove that under certain conditions, our new method can asymptotically achieve variable selection consistency. Simulated and real data results demonstrate that DOSK is highly competitive among existing approaches for RKHS learning.
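
For context, the standard RKHS setup referenced above is the regularized empirical risk problem below; the first display is textbook material, while the second is only an illustrative sketch of how two sparsity-inducing penalties can coexist, not necessarily the exact DOSK objective of the paper.

\[
\min_{f \in \mathcal{H}_K} \; \frac{1}{n} \sum_{i=1}^{n} L\big(y_i, f(x_i)\big) + \lambda \, \|f\|_{\mathcal{H}_K}^2,
\qquad
\hat{f}(x) = \sum_{i=1}^{n} \alpha_i K(x, x_i),
\]

where the representer theorem yields the finite expansion and \(\|f\|_{\mathcal{H}_K}^2 = \alpha^{\top} K \alpha\). This ridge-type penalty shrinks coefficients but sets neither dual coefficients \(\alpha_i\) nor predictors exactly to zero, which is why it cannot perform data extraction or variable selection directly. A plausible double-sparsity variant (an assumption for intuition only) replaces \(K\) with a kernel \(K_w\) that depends on nonnegative predictor weights \(w\) and adds \(\ell_1\) penalties:

\[
\min_{\alpha, \, w \ge 0} \; \frac{1}{n} \sum_{i=1}^{n} L\Big(y_i, \sum_{j=1}^{n} \alpha_j K_w(x_i, x_j)\Big)
+ \lambda_1 \, \alpha^{\top} K_w \alpha + \lambda_2 \, \|\alpha\|_1 + \lambda_3 \, \|w\|_1,
\]

so that zeros in \(\alpha\) extract a sparse subset of data points (dual-space sparsity) and zeros in \(w\) remove noise predictors (variable selection).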

Keywords

data extraction, kernel classification, kernel regression, reproducing kernel Hilbert space, selection consistency, variable selection

Received 13 May 2017

Published 17 September 2018