Statistics and Its Interface

Volume 15 (2022)

Number 4

Partial profile score feature selection in high-dimensional generalized linear interaction models

Pages: 433 – 447

DOI: https://dx.doi.org/10.4310/21-SII706

Authors

Zengchao Xu (School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China)

Shan Luo (School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China)

Zehua Chen (Department of Statistics and Applied Probability, National University of Singapore)

Abstract

Sequential methods are promising for feature selection in high-dimensional models. In this paper, we propose a sequential approach based on the partial profile score, dubbed PPSFS, for feature selection in a broad class of high-dimensional models, including high-dimensional generalized linear interaction models. The PPSFS approach achieves strong feature selection performance while remaining highly scalable for ultra-high-dimensional models. The selection consistency of the PPSFS approach is established under mild conditions. Comprehensive numerical studies demonstrating the performance of PPSFS are reported. A real data analysis of gene expression cancer RNA-Seq data is also presented.

Keywords

feature selection, partial profile score, high-dimensional, interaction model, sequential procedure

2010 Mathematics Subject Classification

Primary 62F99. Secondary 62J12.

Received 7 June 2021

Accepted 13 October 2021

Published 4 March 2022