Statistics and Its Interface
Volume 12 (2019)
Model-free conditional feature screening with exposure variables
Pages: 239 – 251
In high-dimensional analysis, the effects of explanatory variables on responses sometimes depend on certain exposure variables, such as time or environmental factors. In this paper, to characterize the importance of each predictor, we use the conditional correlation, given the exposure variables, between that predictor and the empirical distribution function of the response. Based on this idea, we propose a model-free conditional screening method that aims to identify significant predictors whose effects may vary with the exposure variables. The proposed screening procedure is applicable to any model form, including heteroscedastic models in which the variance component may also vary with the exposure variables. It is also robust to extreme values and outliers. Under some mild conditions, we establish the sure screening and ranking consistency properties of the screening method. Finite-sample performance is illustrated through simulation studies and an application to a breast cancer dataset.
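To make the idea concrete, the following is a minimal illustrative sketch, not the authors' estimator. It assumes a single scalar exposure variable and a Gaussian-kernel (Nadaraya-Watson style) estimate of the conditional correlation between each predictor and the empirical distribution function of the response, and it ranks predictors by the average squared conditional correlation. The function name `conditional_corr_screen`, the `bandwidth` parameter, and the simulated model are all hypothetical choices made for illustration.

```python
import numpy as np

def conditional_corr_screen(X, y, u, bandwidth=0.1):
    """Score each column of X by the average squared kernel-weighted
    correlation with F_n(y), conditional on the exposure variable u.
    Illustrative sketch only; the smoothing scheme is an assumption."""
    n, p = X.shape
    # Empirical distribution function of the response at each observation
    # (rank / n; assumes no ties in y).
    Fy = (np.argsort(np.argsort(y)) + 1) / n
    # Gaussian kernel weights; row k holds weights centered at u[k].
    diffs = (u[:, None] - u[None, :]) / bandwidth
    W = np.exp(-0.5 * diffs ** 2)
    W /= W.sum(axis=1, keepdims=True)
    # Kernel-weighted local moments at every observed exposure value.
    EX = W @ X
    EF = W @ Fy
    EXF = W @ (X * Fy[:, None])
    varX = W @ X ** 2 - EX ** 2
    varF = W @ Fy ** 2 - EF ** 2
    cov = EXF - EX * EF[:, None]
    # Squared local correlation; small constant guards against division by zero.
    rho2 = cov ** 2 / (varX * varF[:, None] + 1e-12)
    # Average over the observed exposure values to get one score per predictor.
    return rho2.mean(axis=0)

if __name__ == "__main__":
    # Hypothetical simulated example: the effect of predictor 0 changes sign
    # with u, and the noise variance also depends on u.
    rng = np.random.default_rng(0)
    n, p = 500, 50
    X = rng.standard_normal((n, p))
    u = rng.uniform(size=n)
    y = (3 * np.cos(2 * np.pi * u) * X[:, 0] + 2 * X[:, 1]
         + 0.5 * (1 + u) * rng.standard_normal(n))
    scores = conditional_corr_screen(X, y, u)
    print("Top-ranked predictors:", np.argsort(scores)[::-1][:5])
```

Because the score depends on the response only through its ranks, a few extreme response values change it very little, which is in the spirit of the robustness described above. In the simulated example, the coefficient of predictor 0 averages to roughly zero over the exposure variable, so a purely marginal correlation could miss it while a conditional measure need not.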
conditional screening, feature screening, exposure variable, model-free, sure screening property, variable selection
The research of Jingyuan Liu is supported in part by the National Natural Science Foundation of China (NSFC, 11771361), JAS14007, and Fundamental Research Funds for the Scientific Research Foundation for the Returned Overseas Chinese Scholars.
The research of Liping Zhu is supported in part by the National Natural Science Foundation of China (NSFC, 11371236, 11422107) and the Henry Fok Education Foundation Fund of Young College Teachers (141002).
Received 10 April 2018
Published 11 March 2019