FermatS: A Novel Numerical Representation for Protein Sequence Comparison and DNA-binding Protein Identification

Yanping Zhang; Ya Gao; Jianwei Ni; Pengcheng Chen; Xiaosheng Wang

doi:10.2174/1386207323999201117111738

Abstract

Aims: To develop a simple and effective method, based on protein sequence information, for analyzing protein sequence similarity and predicting DNA-binding proteins.

Background: Given the rapidly growing volume of available molecular biology data, low-complexity computational methods are needed to accurately infer protein structure, function, and evolution.

Objective: To develop novel computational algorithms for analyzing and comparing protein sequences, given the rapidly growing volume of available molecular biology data.

Methods: The numerical characteristics of a protein sequence are obtained from three sources: a global position representation based on Fermat spiral curves, a local position representation based on the normalized moments of inertia of those curves, and the composition of the 20 amino acids.
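The abstract does not give the exact construction, so the following is only a minimal sketch of how such a descriptor might be assembled. It assumes that residue i is placed on the Fermat spiral r = a·sqrt(θ) at θ = i·step, that every residue has unit mass, and that the local descriptor is the normalized moment of inertia of the points occupied by each amino acid type. The function names and the parameters `a` and `step` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def fermat_points(n, a=1.0, step=0.5):
    """One point per sequence position on the Fermat spiral r = a*sqrt(theta).

    Assumption: position i (1-based) maps to theta = i * step.
    """
    points = []
    for i in range(1, n + 1):
        theta = i * step
        r = a * math.sqrt(theta)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def normalized_moment_of_inertia(points):
    """sqrt(I / M) about the centroid, with unit mass per point (assumption)."""
    if not points:
        return 0.0
    m = len(points)
    cx = sum(x for x, _ in points) / m
    cy = sum(y for _, y in points) / m
    inertia = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(inertia / m)


def composition(seq):
    """Relative frequency of each of the 20 amino acids in seq."""
    counts = Counter(seq)
    return [counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS]


def features(seq):
    """Hypothetical 40-dimensional vector: composition + per-residue inertia."""
    points = fermat_points(len(seq))
    inertia = []
    for aa in AMINO_ACIDS:
        occupied = [p for p, res in zip(points, seq) if res == aa]
        inertia.append(normalized_moment_of_inertia(occupied))
    return composition(seq) + inertia
```

Under these assumptions, two sequences could be compared by a distance between their feature vectors, for example the Euclidean distance `math.dist(features(s1), features(s2))`.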

Results: The method was applied to analyze the similarity/dissimilarity of nine ND5 proteins, and the results are consistent with biological evolution theory. Furthermore, a DNA-binding protein prediction model was built using logistic regression with 5-fold cross-validation; on an unbalanced dataset, it outperformed DNAbinder, iDNA-prot, DNA-prot and gDNA-prot by 0.0069-0.609 in terms of F-measure and by 0.293-0.898 in terms of MCC.
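For readers unfamiliar with the two reported metrics, the sketch below computes the F-measure and the Matthews correlation coefficient (MCC) from confusion-matrix counts using their standard definitions. The counts in the usage note are made up for illustration and are not results from the paper.

```python
import math


def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With illustrative counts of 8 true positives, 8 true negatives, 2 false positives and 2 false negatives, `f_measure(8, 2, 2)` gives 0.8 and `mcc(8, 8, 2, 2)` gives 0.6. MCC uses all four cells of the confusion matrix, which is why it is often reported alongside the F-measure on unbalanced datasets.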

Conclusion: These results show that our method, FermatS, is effective for comparing, recognizing, and predicting protein sequences.

Keywords: Fermat spiral, mass, moment of inertia, similarity/dissimilarity of species, identification of DNA-binding proteins, logistic regression.

Publisher: Bentham Science Publisher

Print ISSN: 1386-2073

Online ISSN: 1875-5402

Combinatorial Chemistry & High Throughput Screening