Abstract
Classifying sequences is one of the central problems in computational biosciences. Several tools have been released that map an unknown molecular entity to one of the known classes using solely its sequence data. However, all of the existing tools are problem-specific and restricted to an alphabet constrained by the relevant biological structure. Here, we introduce TRAINER, a new online tool designed to serve as a generic sequence classification platform that lets users provide their own training data with any alphabet defined within that data. TRAINER allows users to select among several feature representation schemes and supervised machine learning methods, each with its relevant parameters. Trained models can be saved so that other users can reuse them later without retraining. Two case studies demonstrate effective use of the system on DNA and protein sequences: candidate effector prediction and nucleolar localization signal prediction. The biological relevance of the results is discussed.
Keywords: Sequence classification, web server, k-nearest neighbors, naive Bayes classifier, support vector machine.
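The idea of pairing a feature representation with a supervised learner over a user-defined alphabet can be illustrated with a small sketch. The abstract does not say which feature schemes TRAINER implements, so the k-mer frequency representation used here is an assumption, and the function names (`kmer_features`, `knn_predict`) and the toy training data are hypothetical. The classifier is a plain k-nearest neighbors vote, one of the methods named in the keywords; this is not TRAINER's actual code.

```python
from collections import Counter
from itertools import product
import math


def kmer_features(seq, alphabet, k=2):
    """Return normalized k-mer frequencies of seq over a user-defined alphabet.

    Assumption: k-mer composition is used only as an example feature scheme.
    """
    # Enumerate every possible k-mer so all sequences share one feature order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Guard against division by zero for sequences shorter than k.
    total = max(sum(counts[m] for m in kmers), 1)
    return [counts[m] / total for m in kmers]


def knn_predict(train, query, alphabet, k_neighbors=3, k=2):
    """Classify query by majority vote among its nearest training sequences."""
    q = kmer_features(query, alphabet, k)
    # Euclidean distance between feature vectors, smallest first.
    scored = sorted(
        (math.dist(q, kmer_features(seq, alphabet, k)), label)
        for seq, label in train
    )
    votes = Counter(label for _, label in scored[:k_neighbors])
    return votes.most_common(1)[0][0]


# Toy training data, made up for illustration only.
train = [
    ("ACACACACAC", "repeat_AC"),
    ("CACACACACA", "repeat_AC"),
    ("ACACACCACA", "repeat_AC"),
    ("GTGTGTGTGT", "repeat_GT"),
    ("TGTGTGTGTG", "repeat_GT"),
    ("GTGTGTTGTG", "repeat_GT"),
]
prediction = knn_predict(train, "ACACACACCA", alphabet="ACGT")
print(prediction)
```

Because the alphabet is passed in as a parameter, the same sketch would work for protein sequences or any other symbol set by changing the `alphabet` argument and the training pairs.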
Protein & Peptide Letters
Title: TRAINER: A General-Purpose Trainable Short Biosequence Classifier
Volume: 20 Issue: 10
Author(s): Hasan Ogul, Alper T. Kalkan, Sinan U. Umu and Mahinur S. Akkaya
Cite this article as:
Ogul Hasan, Kalkan T. Alper, Umu U. Sinan and Akkaya S. Mahinur, TRAINER: A General-Purpose Trainable Short Biosequence Classifier, Protein & Peptide Letters 2013; 20(10). https://dx.doi.org/10.2174/0929866511320100004
DOI: https://dx.doi.org/10.2174/0929866511320100004
Print ISSN: 0929-8665
Publisher: Bentham Science Publishers
Online ISSN: 1875-5305