A Hierarchical Classification for the Selection of the Most Suitable Multiple Sequence Alignment Methodology

Francisco M. Ortuño; Hector Pomares; Olga Valenzuela; Carolina Torres; Ignacio Rojas

doi:10.2174/157489361002150518145112

Abstract

Multiple sequence alignments (MSAs) are currently one of the most powerful procedures in bioinformatics, providing information useful to other techniques such as biological function analysis, structure prediction and next-generation sequencing. Nevertheless, current MSA methodologies produce quite different alignments for the same set of sequences, depending on particular biological features of those sequences. For this reason, selecting a suitable tool for aligning a specific set of sequences is an important task that has not yet been fully solved. In this work, we propose a hierarchical algorithm of several binary classifiers based on support vector machines (SVMs) to predict a priori the MSA tool that will provide the most accurate alignment. First, a set of heterogeneous biological features related to each set of sequences is retrieved from well-known databases. Subsequently, the features most significant for each specific aligner are included in that aligner's classifier. Finally, the SVM classifiers are combined to decide the most suitable method according to the quality of each classification. This procedure was assessed on the BAliBASE v3.0 benchmark and compared against other similar tools, namely AlexSys and PAcAlCI.

Keywords: Feature extraction, feature selection, machine learning, multiple sequence alignments (MSAs), support vector machine (SVM).
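
The final combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the classifiers are joined, so the sketch assumes that the binary classifiers are consulted in descending order of a quality score and that the first one to answer "suitable" determines the tool. The names `AlignerClassifier`, `select_aligner`, the aligner labels and the feature names are all hypothetical, and plain threshold functions stand in for trained SVMs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Feature vector for one set of sequences (feature names are hypothetical).
Features = Dict[str, float]


@dataclass
class AlignerClassifier:
    aligner: str                          # MSA tool this binary classifier votes for
    quality: float                        # assumed validation score of the classifier
    predict: Callable[[Features], bool]   # stand-in for a trained binary SVM


def select_aligner(classifiers: List[AlignerClassifier],
                   features: Features,
                   default: str) -> str:
    """Consult classifiers from highest to lowest quality (an assumption).

    The first classifier that predicts "suitable" decides the aligner;
    if none does, the default aligner is returned.
    """
    ranked = sorted(classifiers, key=lambda c: c.quality, reverse=True)
    for clf in ranked:
        if clf.predict(features):
            return clf.aligner
    return default


# Toy rules replacing real SVM decision functions (illustrative only).
classifiers = [
    AlignerClassifier("aligner_B", 0.84, lambda f: f["num_sequences"] > 100),
    AlignerClassifier("aligner_A", 0.91, lambda f: f["percent_identity"] < 25.0),
]

choice = select_aligner(
    classifiers,
    {"percent_identity": 40.0, "num_sequences": 250},
    default="aligner_C",
)
print(choice)
```

In this toy run the higher-quality classifier for `aligner_A` is asked first and rejects the input, so the decision passes to the classifier for `aligner_B`. Ordering by quality is only one plausible reading of "according to the quality of each classification".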

DOI: https://dx.doi.org/10.2174/157489361002150518145112
Print ISSN: 1574-8936
Online ISSN: 2212-392X
Publisher Name: Bentham Science Publisher

Current Bioinformatics
