Statistics and Its Interface
Volume 15 (2022)
Empirical likelihood-based estimation and inference in randomized controlled trials with high-dimensional covariates
Pages: 283 – 301
In this paper, we propose a data-adaptive empirical likelihood-based approach for treatment effect estimation and inference. It overcomes the difficulty that traditional empirical likelihood-based approaches face in the high-dimensional setting by adopting penalized regression and machine learning methods to model the covariate-outcome relationship. In particular, we show that our procedure recovers the true variance of Zhang’s treatment effect estimator by using a data-splitting technique. Our proposed estimator is proved to be asymptotically normal and semiparametric efficient under mild regularity conditions. Simulation studies indicate that our estimator is more efficient than the estimator proposed by Wager et al. when random forest is employed to model the covariate-outcome relationship. Moreover, when multiple machine learning models are employed, our estimator is at least as efficient as any regular estimator based on a single machine learning model. We compare our method with existing ones using the ACTG175 data and the GSE118657 data, and the results support the strong performance of our approach.
average treatment effect, data splitting, machine learning, multiple robustness, semiparametric efficiency bound
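The data-splitting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the empirical likelihood procedure proposed in the paper: it is a generic cross-fitted, covariate-adjusted (augmented inverse-probability-weighted) estimator of the average treatment effect in a randomized trial, shown only to make concrete how outcome models are fit on one part of the data and evaluated on the other. The function names (ridge_fit, ridge_predict, cross_fitted_ate), the closed-form ridge regression standing in for a penalized or machine learning outcome model, the choice of two folds, and the use of the sample treatment proportion as the randomization probability are all assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    Xc = np.column_stack([np.ones(len(X)), X])
    penalty = lam * np.eye(Xc.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    return np.linalg.solve(Xc.T @ Xc + penalty, Xc.T @ y)


def ridge_predict(beta, X):
    """Predict outcomes from a coefficient vector returned by ridge_fit."""
    return beta[0] + X @ beta[1:]


def cross_fitted_ate(X, a, y, n_folds=2, lam=1.0, seed=0):
    """Cross-fitted augmented estimator of the average treatment effect.

    X: (n, p) covariates, a: (n,) binary treatment indicator, y: (n,) outcome.
    Returns the point estimate and a plug-in standard error.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds  # random, nearly equal-sized folds
    pi = a.mean()  # assumed stand-in for the known randomization probability
    phi = np.empty(n)
    for k in range(n_folds):
        test = folds == k
        train = ~test
        # Fit arm-specific outcome models on the training folds only.
        b1 = ridge_fit(X[train & (a == 1)], y[train & (a == 1)], lam)
        b0 = ridge_fit(X[train & (a == 0)], y[train & (a == 0)], lam)
        # Evaluate them on the held-out fold.
        m1 = ridge_predict(b1, X[test])
        m0 = ridge_predict(b0, X[test])
        at, yt = a[test], y[test]
        # Model-based contrast plus inverse-probability-weighted residuals.
        phi[test] = (m1 - m0
                     + at * (yt - m1) / pi
                     - (1 - at) * (yt - m0) / (1 - pi))
    estimate = phi.mean()
    std_error = phi.std(ddof=1) / np.sqrt(n)
    return estimate, std_error
```

Because each observation's contribution uses outcome models fit without that observation's fold, the sample variance of the contributions gives a simple plug-in standard error. Any other learner (for example, a random forest) could replace the ridge step without changing the structure of the sketch.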
Ying Yan’s research is supported by the National Natural Science Foundation of China (NSFC) (Grant No. 11901599).
Received 5 October 2020
Accepted 16 June 2021
Published 14 February 2022