Statistics and Its Interface

Volume 17 (2024)

Number 3

Towards better clinical prediction and interpretation via a new imputation strategy from National Health and Nutrition Examination Survey

Pages: 425 – 437

DOI: https://dx.doi.org/10.4310/23-SII783

Authors

Zhe Gao (Sun Yat-Sen University)

Yanfang Xu (Sun Yat-Sen University)

Wangwei Wu (Sun Yat-Sen University)

Xueqin Wang (University of Science and Technology of China)

YinYing Kong (Guangdong University of Finance and Economics)

Ting Tian (Sun Yat-Sen University)

Abstract

Heart failure, the most prevalent reason for hospitalization and readmission among the elderly, is a common, expensive, and potentially fatal condition caused by impairment of the heart’s ability to pump blood. Because heart failure is a syndrome and the potential final stage of all heart diseases, finding its underlying causes is essential to diagnosis and treatment. In observational studies, missing values are frequent and challenging. We propose an efficient and interpretable framework for analyzing data with missing values, based on the matrix completion method and a logistic regression model. Our approach also includes a simple rank estimation procedure that determines the hyperparameters in matrix completion and provides interpretability for the number of important variables or risk factors. We conduct a case study on heart failure using National Health and Nutrition Examination Survey data (2007–2014) to explore new risk factors. Every participant had missing values in some variables, giving a total missing rate of 47.73%. In this case study, our method improves the AUC. Moreover, total spine bone mineral content (BMC) is identified as a potential risk factor associated with heart failure.
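The pipeline outlined in the abstract (estimate a rank, complete the data matrix, then fit a logistic regression) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `estimate_rank`, `soft_impute`, and `fit_logistic` are hypothetical, the singular-value-ratio rule is only a stand-in for the paper's rank estimation procedure (which the abstract does not specify), the soft-thresholded SVD iteration is one common matrix completion scheme chosen here as an assumption, and the data are synthetic rather than NHANES.

```python
import numpy as np

def estimate_rank(X, max_rank=10):
    """Assumed heuristic: pick the rank at the largest gap between
    consecutive singular values of the column-mean-filled matrix."""
    filled = np.where(np.isnan(X), np.nanmean(X, axis=0), X)
    s = np.linalg.svd(filled, compute_uv=False)[:max_rank + 1]
    ratios = s[:-1] / np.maximum(s[1:], 1e-12)
    return int(np.argmax(ratios)) + 1

def soft_impute(X, rank, lam=0.1, n_iter=200, tol=1e-6):
    """Fill NaN entries with a low-rank fit obtained by iterating a
    truncated, soft-thresholded SVD; observed entries are kept as-is."""
    mask = ~np.isnan(X)
    Z = np.where(mask, X, np.nanmean(X, axis=0))  # start from column means
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        s = np.maximum(s[:rank] - lam, 0.0)       # shrink the top singular values
        low_rank = (U[:, :rank] * s) @ Vt[:rank]
        Z_new = np.where(mask, X, low_rank)       # overwrite only missing cells
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Plain gradient-descent logistic regression; returns [intercept, coefs...]."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

# Synthetic demonstration: a rank-2 matrix with roughly 40% of entries missing.
rng = np.random.default_rng(0)
truth = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))
labels = (truth[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(float)
observed = truth.copy()
observed[rng.random(truth.shape) < 0.4] = np.nan

r = estimate_rank(observed)
completed = soft_impute(observed, rank=r)
weights = fit_logistic(completed, labels)
print("estimated rank:", r)
print("logistic coefficients:", np.round(weights, 3))
```

In this sketch the estimated rank doubles as the only structural hyperparameter of the completion step, which mirrors the role the abstract assigns to rank estimation; the shrinkage level `lam` and the learning rate `lr` are arbitrary illustrative values.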

Keywords

heart failure, missing values, rank, matrix completion, logistic regression, total spine bone mineral content

2010 Mathematics Subject Classification

Primary 62Dxx. Secondary 15A83, 62J12.

Received 19 November 2022

Accepted 9 February 2023

Published 19 July 2024