Statistics and Its Interface

Volume 8 (2015)

Number 4

Rare variant testing across methods and thresholds using the multi-kernel sequence kernel association test (MK-SKAT)

Pages: 495 – 505

DOI: https://dx.doi.org/10.4310/SII.2015.v8.n4.a8

Authors

Eugene Urrutia (Department of Biostatistics, University of North Carolina, Chapel Hill, N.C., U.S.A.)

Seunggeun Lee (Department of Biostatistics, University of Michigan, Ann Arbor, Mich., U.S.A.)

Arnab Maity (Department of Statistics, North Carolina State University, Raleigh, N.C., U.S.A.)

Ni Zhao (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, U.S.A.)

Judong Shen (Quantitative Sciences, R&D, GlaxoSmithKline, Research Triangle Park, North Carolina, U.S.A.)

Yun Li (Department of Genetics and Department of Biostatistics, University of North Carolina, Chapel Hill, N.C., U.S.A.)

Michael C. Wu (Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, U.S.A.)

Abstract

Analysis of rare genetic variants has focused on region-based analysis, wherein a subset of the variants within a genomic region is tested for association with a complex trait. Two important practical challenges have emerged. First, it is difficult to choose which test to use. Second, it is unclear which group of variants within a region should be tested. Both choices depend on the unknown true state of nature. Therefore, we develop the Multi-Kernel SKAT (MK-SKAT), which tests across a range of rare variant tests and groupings. Specifically, we demonstrate that several popular rare variant tests are special cases of the sequence kernel association test, which compares pairwise similarity in trait value to similarity in the rare variant genotypes between subjects, as measured through a kernel function. Choosing a particular test is equivalent to choosing a kernel. Similarly, choosing which group of variants to test also reduces to choosing a kernel. Thus, MK-SKAT uses perturbation to test across a range of kernels. Simulations and real data analyses show that our framework controls type I error while maintaining high power across settings: MK-SKAT loses some power relative to the best kernel for a particular scenario but has much greater power than poor kernel choices.
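
The kernel view described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`linear_kernel`, `burden_kernel`, `multi_kernel_test`), the intercept-only null model for a continuous trait, and the Monte Carlo min-p calibration are all assumptions made for illustration. Each candidate test corresponds to a kernel K, the statistic is the quadratic form Q = r'Kr in the null-model residuals r, and one shared set of perturbed residuals is used to calibrate the smallest p-value across kernels.

```python
import numpy as np

def linear_kernel(G, weights=None):
    """Weighted linear kernel K = G diag(w) G^T for an n x p genotype matrix G."""
    G = np.asarray(G, dtype=float)
    w = np.ones(G.shape[1]) if weights is None else np.asarray(weights, dtype=float)
    return (G * w) @ G.T

def burden_kernel(G, weights=None):
    """Burden-type kernel K = (G w)(G w)^T: similarity in weighted allele counts."""
    G = np.asarray(G, dtype=float)
    w = np.ones(G.shape[1]) if weights is None else np.asarray(weights, dtype=float)
    s = G @ w
    return np.outer(s, s)

def multi_kernel_test(y, kernels, n_perturb=2000, seed=0):
    """Hypothetical helper: min-p test across kernels, calibrated by perturbation.

    Assumes a continuous trait and an intercept-only null model.
    Returns (per-kernel p-values, overall p-value).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size
    r = y - y.mean()                      # residuals under the null model
    sigma = r.std(ddof=1)

    # Observed statistic Q_k = r' K_k r for every candidate kernel.
    q_obs = np.array([r @ K @ r for K in kernels])

    # Perturbed residuals: Gaussian noise with the intercept projected out.
    # The same draws are reused for all kernels, preserving their correlation.
    Z = rng.standard_normal((n_perturb, n)) * sigma
    Z -= Z.mean(axis=1, keepdims=True)
    q_null = np.stack(
        [np.einsum("bi,ij,bj->b", Z, K, Z) for K in kernels], axis=1
    )                                     # shape (n_perturb, n_kernels)

    # Per-kernel Monte Carlo p-values for the observed statistics.
    p_obs = (1 + (q_null >= q_obs).sum(axis=0)) / (n_perturb + 1)

    # Rank-based p-value of each perturbed replicate within its own kernel,
    # then the null distribution of the minimum p-value across kernels.
    ranks = q_null.argsort(axis=0).argsort(axis=0)   # 0 = smallest Q
    p_null = (n_perturb - ranks) / (n_perturb + 1)
    min_p_null = p_null.min(axis=1)

    p_overall = (1 + (min_p_null <= p_obs.min()).sum()) / (n_perturb + 1)
    return p_obs, p_overall
```

Calling `multi_kernel_test(y, [linear_kernel(G), burden_kernel(G)])` on a trait vector `y` and a genotype matrix `G` returns one p-value per kernel and a single p-value for the whole set. Restricting `G` to different subsets of variants before building the kernels would, under the same assumptions, cover the choice of variant grouping as well.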

Keywords

rare variants, perturbation, sequence kernel association test, sequencing association studies

Published 19 October 2015