Statistics and Its Interface

Volume 5 (2012)

Number 1

Permutation methods for testing the significance of phosphorylation motifs

Pages: 61 – 73

DOI: https://dx.doi.org/10.4310/SII.2012.v5.n1.a6

Authors

Haipeng Gong (School of Software, Dalian University of Technology, Dalian City, Liaoning, China)

Zengyou He (School of Software, Dalian University of Technology, Dalian City, Liaoning, China)

Abstract

Phosphorylation motifs represent common patterns around phosphorylation sites. Because discovering such motifs reveals the underlying regulation mechanism and facilitates the prediction of unknown phosphorylation events, several phosphorylation motif discovery methods have been proposed, including Motif-X, MoDL, and Motif-All. Each of these methods can find a certain number of motifs; however, there are still no theoretically guided measures for distinguishing true phosphorylation motifs from false ones. Since biological validation of all reported motifs is expensive and time-consuming, effective statistical methods are needed as a preliminary filter to remove non-significant motifs. To address this problem, we use permutation to calculate p-values of identified motifs so that their statistical significance can be assessed accurately. We propose three permutation methods: the Standard Permutation (SP), the Adaptive Marginal Effect Permutation (AMEP), and the Modified Adaptive Marginal Effect Permutation (MAMEP). We conduct comprehensive experimental studies to demonstrate the effectiveness of our methods. Experimental results on real data and simulation studies show that all three permutation methods are capable of removing potential false positives. In particular, both AMEP and MAMEP are of practical use and can satisfy different requirements of biological researchers.
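
As a rough illustration of how a permutation p-value for a motif might be computed, the sketch below shuffles class labels between phosphorylated (foreground) and unphosphorylated (background) peptides and counts how often the shuffled foreground supports the motif at least as strongly as the observed one. This is a generic label-shuffling test written for illustration only; it is not the SP, AMEP, or MAMEP procedure, whose details are not given in the abstract. The function names, the '.' wildcard convention, the choice of support count as the test statistic, and the add-one p-value estimate are all assumptions.

```python
import random


def matches(peptide, motif):
    """Return True if the peptide matches the motif ('.' is a wildcard; assumed convention)."""
    if len(peptide) != len(motif):
        return False
    return all(m == "." or m == a for m, a in zip(motif, peptide))


def support(peptides, motif):
    """Count how many peptides match the motif."""
    return sum(matches(p, motif) for p in peptides)


def permutation_p_value(foreground, background, motif, n_perm=1000, seed=0):
    """Estimate a one-sided p-value for a motif by shuffling class labels.

    Hypothetical helper: the test statistic is the motif's support in the
    foreground set, and the null distribution is obtained by randomly
    reassigning peptides to foreground and background.
    """
    rng = random.Random(seed)
    observed = support(foreground, motif)
    pooled = list(foreground) + list(background)
    n_fg = len(foreground)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # The first n_fg peptides play the role of the permuted foreground.
        if support(pooled[:n_fg], motif) >= observed:
            hits += 1

    # Add-one estimate keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Under these assumptions, a motif whose support in the real foreground is rarely reached by the shuffled data receives a small p-value, while a motif that matches foreground and background peptides equally often receives a p-value close to 1 and would be filtered out.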

Keywords

phosphorylation motif, frequent-pattern mining, permutation test

Published 17 February 2012