Abstract
Carefully developed quantitative structure-activity and structure-property relationship models contain detailed information about how differences in the molecular structure of compounds correlate with differences in their observed biological or physicochemical properties. The ability to understand the behavior of existing molecules and to design new ones is facilitated by an objective method for extracting and explaining the details of the underlying structure-activity or structure-property relationship. Furthermore, a clear understanding of how and why compounds behave as they do can lead to innovations through model-directed selection of compounds for use in complex mixtures such as laundry detergents, fabric softeners, and shampoos. An objective method of this kind has been developed based on partial least-squares (PLS) regression analysis; it allows the identification of specific structural trends that relate to differences in observed properties. However, analysis of the completed model is only the last step of the process. The model development process itself affects the ability to extract a clear interpretation of the model. Every stage, from the selection of the initial pool of molecular descriptors to the optimization of the data set and the model, affects the ability to derive detailed molecular design information. This review describes the method in detail, gives examples of the use of PLS for model interpretation, and outlines suggestions for model development and for model and data set optimization that enable the interpretation process.
Keywords: Critical micelle concentration, model interpretation, molecular descriptors, partial least squares, perfume delivery, QSAR, QSPR, regression analysis, structure-property correlation, drug design