Statistics and Its Interface
Volume 4 (2011)
A clustered optimal ROC curve method for family-based genetic risk prediction
Pages: 373 – 380
Risk prediction that capitalizes on emerging genetic findings holds great promise for improving public health and clinical care. Statistical methods for genetic risk prediction, particularly for correlated data, are, however, still lacking. To address this gap, we have developed a clustered optimal ROC curve (CORC) method for building predictive genetic tests from family-based genetic data. The proposed method extends the conventional optimal ROC curve method to handle multiple genetic markers while taking sample correlation into account, and uses a forward selection algorithm to accommodate high-dimensional data and capture possible epistasis. We evaluated the CORC method in simulations and in a real-data application. In the simulations, it performed better than existing methods under various pedigree structures and underlying disease models. In the real-data application, we applied the method to the large-scale International Multi-Center ADHD Genetics Project dataset and formed a predictive genetic test for conduct disorder. The test reached low to medium classification accuracy, with an AUC of 0.6908.
clustered ROC curve, predictive genetic test, high-dimensional data, genome-wide association study
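As an illustration of the forward selection idea described in the abstract, the following is a minimal sketch, not the authors' CORC implementation. It assumes unrelated subjects (so it ignores the family correlation that CORC is designed to handle), assumes a likelihood-ratio formulation of the optimal ROC curve in which each subject is scored by the empirical likelihood ratio of their multi-locus genotype, and estimates the AUC with the Mann-Whitney statistic. The function names, the 0.5 pseudo-count, and the toy data are all hypothetical choices made for this example.

```python
from collections import Counter


def auc(scores, labels):
    """Mann-Whitney estimate of the AUC; tied case/control pairs count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lr_scores(genotypes, labels, markers):
    """Score each subject by an empirical likelihood ratio
    P(G | case) / P(G | control), where G is the subject's joint
    genotype over the selected markers."""
    keys = [tuple(g[m] for m in markers) for g in genotypes]
    case = Counter(k for k, y in zip(keys, labels) if y == 1)
    ctrl = Counter(k for k, y in zip(keys, labels) if y == 0)
    n_case, n_ctrl = sum(case.values()), sum(ctrl.values())
    # Assumption: a 0.5 pseudo-count keeps the ratio finite for
    # genotype combinations seen in only one group.
    return [
        ((case[k] + 0.5) / (n_case + 0.5)) / ((ctrl[k] + 0.5) / (n_ctrl + 0.5))
        for k in keys
    ]


def forward_select(genotypes, labels, max_markers=2):
    """Greedily add the marker that most increases the in-sample AUC;
    stop when no remaining marker improves it."""
    n_markers = len(genotypes[0])
    chosen, best = [], 0.5  # 0.5 is the AUC of a non-informative test
    while len(chosen) < max_markers:
        candidates = [m for m in range(n_markers) if m not in chosen]
        if not candidates:
            break
        scored = [
            (auc(lr_scores(genotypes, labels, chosen + [m]), labels), m)
            for m in candidates
        ]
        top_auc, top_marker = max(scored)
        if top_auc <= best:
            break
        chosen.append(top_marker)
        best = top_auc
    return chosen, best


# Toy data (hypothetical): rows are subjects, columns are markers coded 0/1/2.
genotypes = [[0, 1], [0, 0], [1, 1], [1, 0], [2, 1], [2, 0]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = affected, 0 = unaffected
chosen, best_auc = forward_select(genotypes, labels)
print(chosen, best_auc)
```

Because markers are scored jointly rather than additively, a marker with a weak marginal effect can still be selected when it separates cases from controls in combination with markers already chosen, which is one way interaction effects can be picked up. The AUC here is computed on the same data used for selection, so it is optimistic; an honest estimate would need held-out data or cross-validation.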