Abstract
Background: In recent years, similarity searching has gained wide popularity as a method for Ligand-Based Virtual Screening (LBVS). This screening technique compares the features of a target compound with those of each compound in a database. It is well known that no single similarity measure performs best for every active compound structure across all types of activity classes. Several techniques and strategies have been proposed in the literature to improve the overall effectiveness of ligand-based virtual screening approaches.
Objective: The main objective of this work is to propose a feature selection approach based on a genetic algorithm (FSGASS) to improve similarity searching in ligand-based virtual screening.
Methods: Our approach identifies the most important and relevant features of chemical compounds and minimizes the number of features used in their representations. This allows a reduction of the feature space, the elimination of redundancy, a shorter training execution time, and improved performance of the screening process.
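The general idea of selecting fingerprint features with a genetic algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the FSGASS implementation: the fitness function (mean similarity to known actives minus mean similarity to known inactives, with a small penalty per retained feature), the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), and all names and parameter values are hypothetical choices made for the example.

```python
import random


def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def apply_mask(fp, mask):
    """Keep only the fingerprint bits whose mask entry is 1."""
    return [bit for bit, keep in zip(fp, mask) if keep]


def fitness(mask, query, actives, inactives, penalty=0.01):
    """Hypothetical fitness: separate actives from inactives with few features."""
    if not any(mask):
        return float("-inf")  # an empty feature set is never acceptable
    q = apply_mask(query, mask)
    sim_act = sum(tanimoto(q, apply_mask(fp, mask)) for fp in actives) / len(actives)
    sim_inact = sum(tanimoto(q, apply_mask(fp, mask)) for fp in inactives) / len(inactives)
    return sim_act - sim_inact - penalty * sum(mask)


def select_features(query, actives, inactives, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """Evolve a binary mask over fingerprint positions and return the best one."""
    rng = random.Random(seed)
    n = len(query)

    def score(mask):
        return fitness(mask, query, actives, inactives)

    # Start from the full feature set plus random masks.
    population = [[1] * n]
    population += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry over the two best masks unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)


# Toy data (made up for illustration): 8-bit fingerprints.
query = [1, 1, 0, 1, 0, 1, 0, 0]
actives = [[1, 1, 0, 1, 0, 0, 1, 0], [1, 1, 0, 1, 1, 0, 0, 0]]
inactives = [[0, 0, 1, 0, 0, 1, 0, 1], [0, 0, 0, 0, 1, 1, 1, 0]]

best_mask = select_features(query, actives, inactives)
full_mask = [1] * len(query)
```

Because the full feature set is placed in the initial population and the best masks are carried over unchanged, the returned mask never scores worse than using all features under this toy fitness. A real setting would replace the toy data with molecular fingerprints and the fitness with a screening metric measured on a validation set.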
Results: The obtained results demonstrate superior performance compared with those obtained with the Tanimoto coefficient, which is considered the most widely used coefficient for quantifying similarity in the LBVS domain.
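For reference, the Tanimoto coefficient between two binary fingerprints is the number of bits set in both divided by the number of bits set in at least one. The sketch below shows how it can rank a database against a query; the fingerprints and molecule names are made up for illustration, and the helper names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    return both / either if either else 0.0


def rank_by_similarity(query, database):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scores = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 6-bit fingerprints (hypothetical data).
query = [1, 1, 0, 1, 0, 0]
database = {
    "mol_a": [1, 1, 0, 1, 0, 0],
    "mol_b": [1, 0, 0, 1, 1, 0],
    "mol_c": [0, 0, 1, 0, 0, 1],
}

ranking = rank_by_similarity(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy case, mol_a is identical to the query (score 1.0), mol_b shares two of four set bits (0.5), and mol_c shares none (0.0).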
Conclusion: Our results show that significant improvements can be obtained by using molecular similarity searching methods based on feature selection.
Keywords: Feature selection, genetic algorithm, ligand-based virtual screening, similarity searching, similarity coefficients, molecular descriptors, drug discovery.