Statistics and Its Interface

Volume 3 (2010)

Number 4

Variance model selection with application to joint analysis of multiple microarray datasets under false discovery rate control

Pages: 477 – 491

DOI: https://dx.doi.org/10.4310/SII.2010.v3.n4.a6

Authors

Nicola Bacciu (INRA, GARen, Agrocampus, Rennes, France)

Jack C. M. Dekkers (Department of Animal Science, Iowa State University, Ames, Ia., U.S.A.)

Dan Nettleton (Department of Statistics, Iowa State University, Ames, Ia., U.S.A.)

Long Qu (Department of Statistics and Department of Animal Science, Iowa State University, Ames, Ia., U.S.A.)

Abstract

We study the problem of selecting homogeneous variance models vs. heterogeneous variance models in the context of joint analysis of multiple microarray datasets. We provide a modified multiresponse permutation procedure (MRPP), modified cross-validation procedures, and the right AICc (corrected Akaike’s information criterion) for choosing a variance model. In a simple univariate setting, our modified MRPP outperforms commonly used competitors. For microarray data analysis, we suggest using the sum of gene-specific selection criteria to choose one best gene-specific model for use with all genes. Through realistic simulations based on three real microarray studies, we evaluated the proposed methods and found that using the correct model does not necessarily provide the best separation between differentially and equivalently expressed genes, but it does control false discovery rates (FDR) at desired levels. A hybrid procedure to decouple FDR control and differential expression detection is recommended.
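The idea of summing gene-specific selection criteria to pick a single variance model for all genes can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each gene has two treatment groups with normally distributed values, and the textbook small-sample AICc formula, AICc = -2 log L + 2k + 2k(k+1)/(n - k - 1), stands in for the corrected criterion the authors derive. The function names (`gene_aicc`, `choose_variance_model`) are hypothetical.

```python
import math


def _sum_sq(values):
    """Sum of squared deviations from the group mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def _normal_loglik(ss, n):
    """Maximised Gaussian log-likelihood using the ML variance ss / n.

    Assumes ss > 0 (a group with identical values would need special handling).
    """
    return -0.5 * n * (math.log(2 * math.pi * ss / n) + 1)


def aicc(loglik, k, n):
    """Textbook small-sample AICc (an assumption, not the paper's version)."""
    return -2 * loglik + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def gene_aicc(group1, group2):
    """AICc of both variance models for one gene with two groups."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    ss1, ss2 = _sum_sq(group1), _sum_sq(group2)
    # Homogeneous model: two means + one shared variance (k = 3).
    homo = aicc(_normal_loglik(ss1 + ss2, n), 3, n)
    # Heterogeneous model: two means + two variances (k = 4).
    hetero = aicc(_normal_loglik(ss1, n1) + _normal_loglik(ss2, n2), 4, n)
    return {"homogeneous": homo, "heterogeneous": hetero}


def choose_variance_model(genes):
    """Sum per-gene AICc over all genes and return the model with the smaller total."""
    totals = {"homogeneous": 0.0, "heterogeneous": 0.0}
    for group1, group2 in genes:
        for name, value in gene_aicc(group1, group2).items():
            totals[name] += value
    best = min(totals, key=totals.get)
    return best, totals


if __name__ == "__main__":
    # Toy data: the second group is far more variable than the first.
    toy_genes = [
        ([-0.1, 0.1, -0.1, 0.1, -0.1, 0.1], [-10, 10, -10, 10, -10, 10]),
        ([-0.2, 0.2, -0.2, 0.2, -0.2, 0.2], [-8, 8, -8, 8, -8, 8]),
    ]
    best, totals = choose_variance_model(toy_genes)
    print(best, totals)
```

Summing the criterion across genes forces one model on every gene, which is the pooling step described in the abstract. The modified MRPP and cross-validation procedures are not sketched here.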

Keywords

AIC, AICc, cross-validation, false discovery rates, microarray, model selection, multiresponse permutation procedure, variance model

2010 Mathematics Subject Classification

Primary 62F07, 62J20. Secondary 62P10, 92C40.

Published 1 January 2010