Statistics and Its Interface

Volume 7 (2014)

Number 4

Special Issue on Modern Bayesian Statistics (Part I)

Guest Editor: Ming-Hui Chen (University of Connecticut)

Nonparametric Bayesian functional clustering for time-course microarray data

Pages: 543 – 557

DOI: https://dx.doi.org/10.4310/SII.2014.v7.n4.a10

Authors

Ziwen Wei (Merck Pharmaceuticals, Rahway, New Jersey, U.S.A.)

Lynn Kuo (Department of Statistics, University of Connecticut, Storrs, Conn., U.S.A.)

Abstract

Time-course microarray experiments track gene expression levels across several time points. They provide valuable insights into the genome-wide dynamics of gene regulation. In this paper we focus on gene clustering analysis. We explore a nonparametric Bayesian method that constructs clusters in functional space from the characteristics of gene profiles. In particular, we model each gene profile using a B-spline basis, so that each gene is characterized by the basis coefficients of its spline fit. We then place a Dirichlet process prior on the basis coefficients to determine the gene clusters. In essence, we construct a hierarchical Dirichlet process mixture model that assigns genes to the same cluster if they share the same latent basis coefficients. A simulation study compares the proposed method with the K-means clustering method, a model-based clustering method (MCLUST), and two-stage versions of these methods in terms of the adjusted Rand index. The proposed method attains the highest adjusted Rand index among the methods compared. We apply this nonparametric Bayesian clustering method to a real data set with 6 time points to gain further insight into how genes with similar profiles are clustered together, and we identify their functional annotations in Gene Ontology groups using GOstats.
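
The adjusted Rand index used as the comparison criterion above can be illustrated with a minimal sketch. This is the standard textbook formula written in plain Python, not the authors' code; the function name adjusted_rand_index and the toy labels are assumptions made for illustration only.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        # Fewer than two items: no pairs to compare, treat as agreement.
        return 1.0

    # Contingency table cells and its row/column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of item pairs placed together in both, or in each, partition.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    # Expected agreement under random labelling with fixed margins.
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Toy cluster labels for four genes (made-up values).
    truth = [0, 0, 1, 1]
    print(adjusted_rand_index(truth, [1, 1, 0, 0]))  # same partition, relabelled
    print(adjusted_rand_index(truth, [0, 1, 0, 1]))  # partition that splits every true pair
```

The index depends only on which items are grouped together, not on the cluster labels themselves, so it can compare clusterings with different numbers of clusters. A value of 1 indicates identical partitions, and values near 0 indicate agreement no better than chance.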

Keywords

Dirichlet process, time-course microarray, functional data analysis

Published 23 December 2014