Statistics and Its Interface

Volume 3 (2010)

Number 2

Nonparametric tests for longitudinal DNA copy number data

Pages: 211 – 221

DOI: https://dx.doi.org/10.4310/SII.2010.v3.n2.a8

Authors

Haiyan Wang (Department of Statistics, Kansas State University, Manhattan, Ks., U.S.A.)

Ke Zhang (Department of Pathology, School of Medicine and Health Sciences, University of North Dakota, Grand Forks, N.D., U.S.A.)

Abstract

Array comparative genomic hybridization (aCGH) and single nucleotide polymorphism (SNP) array data are becoming commonly available for scientists to study genetic mechanisms involved in complex biological processes. Such data typically contain a large number of probes observed repeatedly over time. Due to cost concerns, the number of replicates is often very limited. Effective hypothesis testing tools need to take into account the high dimensionality and small sample sizes. In this paper, we present a set of nonparametric hypothesis tests for main and interaction effects involving a large number of probes in longitudinal DNA copy number data from aCGH or SNP arrays. The asymptotic distributions of the test statistics are obtained under a realistic model setup that allows distribution-free, robust inference in the presence of temporal correlations for heteroscedastic, high-dimensional, low-sample-size data. The tests provide a flexible tool for a wide range of scientists to accelerate novel gene discovery, such as the identification of aberrant genome regions to help control tumor progression. Simulations and applications of the new methods to DNA copy number aberration data from a Wilms' tumor relapse study are presented.

Keywords

repeated measures, nonparametric statistics, hypothesis testing, DNA copy number aberration, high dimensional data analysis

2010 Mathematics Subject Classification

Primary 62P10. Secondary 62G10, 62G35.

Published 1 January 2010