Abstract
The field of proteomics has grown rapidly in recent years. This growth has been driven mainly by technological improvements in instrumentation, methods, and easy-to-use software, which have made it possible to address a large number of biological questions and to study the proteomes of several organisms in greater depth. These advances have also created a challenge that remains open: the computational analysis of the large datasets commonly generated in a single proteomics experiment. One way to tackle this issue has been to use auxiliary information generated during the proteomics experiment to assess the confidence of the identifications. In this manuscript we review the main molecular descriptors used to build predictive models for estimating retention time, isoelectric point, and peptide “detectability”, which are key tools in the design of several validation strategies based on these criteria. We also give an overview of the main open source tools and libraries used for computing molecular descriptors.
Keywords: Proteomics, molecular descriptor, retention time, isoelectric point, proteotypic peptide, bioinformatics tools, support vector machine, cheminformatics.