Communications in Information and Systems

Volume 18 (2018)

Number 3

Similarity analysis of protein sequences based on a new graphical representation method

Pages: 193 – 208

DOI: https://dx.doi.org/10.4310/CIS.2018.v18.n3.a4

Authors

Yuyan Zhang (School of Agriculture and Hydraulic Engineering, Suihua University, Suihua, China)

Jia Wen (School of Information Engineering, Suihua University, Suihua, China)

Abstract

Similarity analysis is widely used to identify similarities and dissimilarities among protein sequences, a key step in predicting the structures and functions of newly identified proteins. By integrating two properties of amino acids, the isoelectric point and the hydrophobic factor, we propose a new graphical representation of a protein sequence that depicts both its local and its global information. The resulting curve has no degeneracy or arbitrariness, and the correspondence between a protein sequence and its graphical curve is one-to-one. In addition, two numerical characterizations derived from the protein graph are used to quantify each protein sequence. An examination of the similarity of the DN6 proteins from eight different species demonstrates the utility of the new method.
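The abstract does not spell out the construction, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple cumulative two-dimensional walk in which each residue contributes a step of (isoelectric point, hydrophobicity), and it compares two sequences by the distance between the geometric centers of their curves. The function names, the choice of the Kyte-Doolittle hydropathy scale, and the approximate isoelectric point values (which differ slightly between references) are all assumptions made for illustration.

```python
import math

# Kyte-Doolittle hydropathy index (assumed choice of hydrophobicity scale).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Approximate isoelectric points; illustrative values, sources vary slightly.
PI = {
    "A": 6.00, "R": 10.76, "N": 5.41, "D": 2.77, "C": 5.07,
    "Q": 5.65, "E": 3.22, "G": 5.97, "H": 7.59, "I": 6.02,
    "L": 5.98, "K": 9.74, "M": 5.74, "F": 5.48, "P": 6.30,
    "S": 5.68, "T": 5.60, "W": 5.89, "Y": 5.66, "V": 5.96,
}


def curve(seq):
    """Hypothetical curve: residue k moves the point by (pI, hydropathy)."""
    x = y = 0.0
    points = []
    for aa in seq:
        x += PI[aa]          # always positive, so the curve never loops back
        y += HYDROPATHY[aa]
        points.append((x, y))
    return points


def geometric_center(points):
    """Mean of the curve's points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def center_distance(seq_a, seq_b):
    """Euclidean distance between the geometric centers of two curves."""
    return math.dist(geometric_center(curve(seq_a)),
                     geometric_center(curve(seq_b)))
```

Because every amino acid has a distinct step vector in these tables and the x coordinate strictly increases, a sequence can be read back from this toy curve step by step. That illustrates what a one-to-one, non-degenerate representation means, although the paper's actual curve and descriptors may be defined differently.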

Keywords

protein sequence, relative distance, geometrical center, leading eigenvalue, similarity analysis

This work was supported by the Youth Funding of Suihua University (K1501006), the Scientific Research Funding of Suihua University (K1501009, 2017-XGYYWF-017), the Scientific Research Funding of Heilongjiang Education Department (2017-KYYWF-0721), and the Science and Technology Bureau of Suihua (SHKJ2015-020).

Published 22 October 2018