Statistics and Its Interface

Volume 6 (2013)

Number 3

Estimation and imputation in linear regression with missing values in both response and covariate

Pages: 361 – 368

DOI: https://dx.doi.org/10.4310/SII.2013.v6.n3.a6

Author

Jun Shao (School of Finance and Statistics, East China Normal University, Shanghai, China; Department of Statistics, University of Wisconsin, Madison, Wisc., U.S.A.)

Abstract

We consider linear regression with missing responses as well as missing covariate data. When the missing data mechanism is ignorable, we show that regression parameters and the response mean can be estimated using standard methods, treating imputed values as observed data. We also show that the same procedure results in biased and inconsistent estimators when the missing response mechanism depends on covariates that also have missing values and is thus nonignorable. Efficient estimation and imputation under nonignorable missingness is a challenging problem. Under some conditions, we derive asymptotically unbiased and consistent estimators via direct estimation or imputation. Simulation results are presented to examine the finite sample performance of the various estimators.
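
As a rough illustration of the ignorable case only, the following sketch simulates a simple linear model, deletes responses with a probability that depends on a covariate, and then estimates the response mean after imputing fitted values. It is not the paper's procedure: the model, the sample size, the logistic missingness probability, and the helper name `regression_impute_mean` are all assumptions made for this example, and the covariate is taken to be fully observed, which is simpler than the setting studied in the paper.

```python
import numpy as np

def regression_impute_mean(x, y, observed):
    """Fit OLS on complete cases, impute missing responses by fitted
    values, and return (coefficients, mean of the completed responses).
    Hypothetical helper for illustration; not taken from the paper."""
    X = np.column_stack([np.ones_like(x), x])           # intercept + covariate
    beta, *_ = np.linalg.lstsq(X[observed], y[observed], rcond=None)
    y_completed = np.where(observed, y, X @ beta)       # imputed values treated as data
    return beta, y_completed.mean()

# Simulated data (all settings are assumptions): y = 1 + 2x + noise, so E[y] = 1.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)

# Ignorable mechanism: the chance of observing y depends only on the
# fully observed covariate x.
p_obs = 1.0 / (1.0 + np.exp(-(0.5 + x)))
observed = rng.random(n) < p_obs
y = np.where(observed, y_full, np.nan)                  # mask the missing responses

beta_hat, mean_imputed = regression_impute_mean(x, y, observed)
mean_complete_case = y[observed].mean()                 # naive mean of observed y only

print("coefficients:", beta_hat)
print("mean after imputation:", mean_imputed)
print("complete-case mean:", mean_complete_case)
```

In this setup the complete-case mean drifts away from the true mean because units with larger covariate values are more likely to respond, while the regression fit on complete cases and the mean of the completed data stay close to their targets. The sketch says nothing about the nonignorable case described in the abstract, where missingness depends on covariates that are themselves partly missing.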

Keywords

asymptotic unbiasedness and consistency, imputation, linear regression, missing covariate data, missing response data, nonignorable missingness

2010 Mathematics Subject Classification

Primary 62J05. Secondary 62G20.

Published 22 August 2013