Statistics and Its Interface

Volume 17 (2024)

Number 1

Special issue in honor of Professor Lincheng Zhao

Latent class proportional hazards regression with heterogeneous survival data

Pages: 79 – 90



Teng Fei (Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, N.Y., U.S.A.)

John J. Hanfelt (Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, U.S.A.)

Limin Peng (Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, U.S.A.)


Heterogeneous survival data are commonly present in chronic disease studies. Delineating meaningful disease subtypes directly linked to a survival outcome can generate useful scientific implications. In this work, we develop a latent class proportional hazards (PH) regression framework to address such an interest. We propose mixture proportional hazards modeling, which flexibly accommodates class-specific covariate effects while allowing for the baseline hazard function to vary across latent classes. Adapting the strategy of nonparametric maximum likelihood estimation, we derive an Expectation-Maximization (E-M) algorithm to estimate the proposed model. We establish the theoretical properties of the resulting estimators. Extensive simulation studies are conducted, demonstrating satisfactory finite-sample performance of the proposed method as well as the predictive benefit from accounting for the heterogeneity across latent classes. We further illustrate the practical utility of the proposed method through an application to a mild cognitive impairment (MCI) cohort in the Uniform Data Set.
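
For orientation, a generic latent class proportional hazards model of the kind summarized above can be sketched as follows. This is an illustrative formulation only and not necessarily the authors' exact specification: the symbols $K$ (number of classes), $C$ (latent class label), $X$ and $Z$ (covariates for class membership and for survival), $\alpha_k$, $\beta_k$, $\lambda_{0k}$, and the multinomial-logit form of the class probabilities are all assumptions introduced here.

```latex
% Class membership (assumed multinomial logit, class K as reference):
\pi_k(X) = \Pr(C = k \mid X)
  = \frac{\exp(X^\top \alpha_k)}{\sum_{j=1}^{K} \exp(X^\top \alpha_j)},
  \qquad \alpha_K = 0 .

% Class-specific proportional hazards with class-specific baseline hazard:
\lambda(t \mid Z, C = k) = \lambda_{0k}(t)\, \exp(Z^\top \beta_k),
  \qquad \Lambda_{0k}(t) = \int_0^t \lambda_{0k}(s)\, ds .

% Observed-data likelihood contribution of subject i with
% follow-up time Y_i and event indicator \Delta_i:
L_i = \sum_{k=1}^{K} \pi_k(X_i)\, L_{ik},
  \qquad
L_{ik} = \bigl\{ \lambda_{0k}(Y_i)\, e^{Z_i^\top \beta_k} \bigr\}^{\Delta_i}
         \exp\bigl\{ -\Lambda_{0k}(Y_i)\, e^{Z_i^\top \beta_k} \bigr\} .

% E-step of an E-M algorithm: posterior class membership weights
w_{ik} = \frac{\pi_k(X_i)\, L_{ik}}{\sum_{j=1}^{K} \pi_j(X_i)\, L_{ij}} .
```

Under this sketch, the M-step would update the class-membership parameters by a weighted multinomial logistic fit and the survival parameters by weighted partial-likelihood-type updates using the weights $w_{ik}$, with each baseline hazard treated nonparametrically as a step function jumping at observed event times.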


Keywords

finite mixture model, latent class analysis, non-parametric maximum likelihood estimator, proportional hazards regression

2010 Mathematics Subject Classification

62P10, 62N01, 62N02

Received 5 September 2022

Accepted 14 February 2023

Published 27 November 2023