Statistics and Its Interface
Volume 14 (2021)
Number 4
Feature screening via Bergsma–Dassios sign correlation learning
Pages: 417 – 430
DOI: https://dx.doi.org/10.4310/20-SII662
Authors
Abstract
The robust rank correlation screening (RRCS) procedure, which is built on Kendall's $\tau$, was proposed by Li, Peng, Zhang and Zhu (2012) as a robust alternative to the sure independence screening (SIS) method based on Pearson's correlation. A drawback for certain applications, however, is that $\tau$ may be zero even when two random variables are associated, so RRCS is not omnibus and can detect only monotonic effects. In this paper, we use the Bergsma–Dassios sign correlation $\tau^\ast_b$ (Bergsma and Dassios, 2014) to introduce a new SIS procedure. We advocate using the $\tau^\ast_b$‑SIS for three reasons. First, because $\tau^\ast_b$ possesses the necessary and intuitive properties of a correlation index, the $\tau^\ast_b$‑SIS has a better screening ability than RRCS for nonlinear effects, including interactions and heterogeneity. Second, because $\tau^\ast_b$ is a natural extension of $\tau$, the $\tau^\ast_b$‑SIS is conceptually simple, easy to implement, and robust to extreme values and outliers in the observations. Third, without assuming any moment condition on the response or the predictors, the $\tau^\ast_b$‑SIS enjoys several appealing properties, such as the sure screening property, the ranking consistency property and the characteristic of minimum model size. We demonstrate the merits of the $\tau^\ast_b$‑SIS procedure through extensive Monte Carlo experiments and illustrate the method with a real-data example.
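The contrast between $\tau$ and the sign correlation can be illustrated with a small Python sketch. It is not the authors' implementation: it computes the plain sample statistic $t^\ast$ of Bergsma and Dassios (2014) by brute force over ordered quadruples, rather than the tie-corrected $\tau^\ast_b$ used in the paper, and the helper names (`kendall_tau_a`, `t_star`, `screen`) and the "keep the top `d` predictors" rule are assumptions made for illustration only.

```python
from itertools import combinations, permutations


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau_a(x, y):
    """Kendall's tau-a: average of sign(xi - xj) * sign(yi - yj) over pairs."""
    n = len(x)
    s = sum(sign(x[i] - x[j]) * sign(y[i] - y[j])
            for i, j in combinations(range(n), 2))
    return s / (n * (n - 1) / 2)


def a(z1, z2, z3, z4):
    """Kernel from Bergsma and Dassios (2014)."""
    return sign(abs(z1 - z2) + abs(z3 - z4) - abs(z1 - z3) - abs(z2 - z4))


def t_star(x, y):
    """Naive O(n^4) sample sign correlation: average of a(x) * a(y)
    over all ordered quadruples of distinct indices (no tie correction)."""
    total = 0
    count = 0
    for i, j, k, l in permutations(range(len(x)), 4):
        total += a(x[i], x[j], x[k], x[l]) * a(y[i], y[j], y[k], y[l])
        count += 1
    return total / count


def screen(columns, y, d, score):
    """Hypothetical marginal screening rule: rank predictors by the absolute
    value of score(column, y) and return the indices of the top d."""
    scores = [abs(score(col, y)) for col in columns]
    order = sorted(range(len(columns)), key=lambda k: scores[k], reverse=True)
    return order[:d]


# A symmetric U-shaped effect: y depends on x, but not monotonically.
x = list(range(-5, 6))
y = [v * v for v in x]

print("Kendall tau-a:", kendall_tau_a(x, y))
print("t*:", t_star(x, y))
```

On this symmetric design the concordant and discordant pairs cancel, so Kendall's $\tau$ vanishes although `y` is a deterministic function of `x`, while the sign correlation statistic stays positive. The brute-force loop is only practical for small samples; it is meant to expose the definition, not to be efficient.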
Keywords
Bergsma–Dassios sign correlation, feature screening, Kendall $\tau$, sure screening property, ranking consistency property, minimum model size
He’s work is supported by the National Natural Science Foundation of China (Grant No. 11201005), the Humanities and Social Sciences Foundation of Ministry of Education, China (Grant No. 17YJC910003) and the Natural Science Foundation of Anhui Province (Grant No. 2008085MA08).
Kai Xu’s work is supported by the National Natural Science Foundation of China (Grant No. 11901006) and the Natural Science Foundation of Anhui Province (Grant No. 1908085QA06).
Lei He’s work is supported by the Natural Science Foundation of Anhui Province (Grant No. 2008085QA15).
Received 20 April 2020
Accepted 27 December 2020
Published 8 July 2021