Abstract
Applications of optimal molecular descriptors as a tool for predicting endpoints relevant to medicinal chemistry are listed. The general scheme for building optimal descriptors is described in detail. The simplified molecular input-line entry system (SMILES) is used to represent molecular architecture. The optimal descriptor is the sum of the correlation weights of molecular fragments extracted from SMILES. The numerical values of the correlation weights are calculated by the Monte Carlo method so as to maximize the correlation coefficient between the experimental values of the endpoint and the corresponding values of the optimal descriptor. The scheme contains two phases: (i) selection of reliable parameters for the Monte Carlo optimization; and (ii) building a model. A mechanistic interpretation of models based on optimal descriptors is suggested; it is derived from the results of several runs of the Monte Carlo optimization. The domain of applicability of these models is defined according to the prevalence of molecular fragments in the training and calibration sets.
Keywords: CORAL software, medicinal chemistry, Monte Carlo method, optimal descriptor, QSAR, SMILES.
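The core idea of the abstract, a descriptor computed as a sum of fragment correlation weights that are tuned by random search to maximize the correlation coefficient, can be illustrated with a minimal sketch. This is not the CORAL implementation. It makes several simplifying assumptions: each SMILES character is treated as one fragment, the optimizer is a plain accept-if-better random search, the calibration set and the parameter-selection phase are omitted, and the endpoint values are synthetic, generated from an arbitrary rule purely for illustration. All function and variable names are hypothetical.

```python
import random
from math import sqrt


def fragments(smiles):
    # Assumption: every SMILES character is one fragment.
    # Real schemes use richer fragments (multi-character atoms, pairs, etc.).
    return list(smiles)


def descriptor(smiles, weights):
    # Optimal descriptor: sum of the correlation weights of the fragments.
    return sum(weights.get(f, 0.0) for f in fragments(smiles))


def pearson(xs, ys):
    # Pearson correlation coefficient; returns 0.0 if either series is constant.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def optimize(train, n_steps=2000, step=0.1, seed=0):
    # Random search: perturb one weight at a time and keep the change
    # only if the training-set correlation coefficient increases.
    rng = random.Random(seed)
    frags = sorted({f for s, _ in train for f in fragments(s)})
    weights = {f: 1.0 for f in frags}
    ys = [y for _, y in train]
    best = pearson([descriptor(s, weights) for s, _ in train], ys)
    for _ in range(n_steps):
        f = rng.choice(frags)
        old = weights[f]
        weights[f] = old + rng.uniform(-step, step)
        r = pearson([descriptor(s, weights) for s, _ in train], ys)
        if r > best:
            best = r
        else:
            weights[f] = old  # reject the move
    return weights, best


def toy_endpoint(smiles):
    # Synthetic endpoint from an arbitrary rule; NOT experimental data.
    rule = {"C": 0.5, "O": 1.5, "N": -1.0, "=": 0.3}
    return sum(rule.get(ch, 0.0) for ch in smiles)


# Hypothetical toy training set of simple SMILES strings.
TRAIN = [(s, toy_endpoint(s)) for s in
         ["CCO", "CCCO", "CCN", "CCCN", "CC=O", "CCCC", "C=CC", "OCCO"]]

if __name__ == "__main__":
    weights, r = optimize(TRAIN)
    print("correlation coefficient on the toy training set:", round(r, 4))
    print("correlation weights:", weights)
```

Because moves are accepted only when the training-set correlation improves, a search of this kind can overfit; that risk is one reason a separate calibration set, as mentioned in the abstract, is useful.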