Abstract
To establish a successful quantitative structure-activity relationship (QSAR) for the antitrypanosomal activity (against Trypanosoma brucei rhodesiense) of polyphenols belonging to different structural groups, multiple linear regression (MLR) and artificial neural networks (ANN) were employed. The analysis was performed with two training sets of different sizes (59% and 78% of the molecules), and yielded relatively successful MLR and ANN models for the data set with the smaller training set. The best MLR model, built on five descriptors (R3m+, GAP, DISPv, HATS2m, JGI2), accounted for only 74% of the variance in the antitrypanosomal activity of the training set and showed high internal but low external predictive ability. The best ANN model, which captures nonlinearities that the linear model cannot, raised the coefficient of determination to 98.6% and showed better external predictive ability. The models indicate that the activity of the polyphenols is related to the distance between oxygen atoms in the molecule and to molecular stability, measured as the difference between the energies of the highest occupied and lowest unoccupied molecular orbitals (GAP).
Keywords: Antitrypanosomal activity, genetic algorithm, multiple linear regression, neural networks, polyphenols, QSAR.
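
The distinction drawn in the abstract between variance explained on the training set and external predictive ability can be illustrated with a minimal sketch. It fits an ordinary least-squares MLR model on five descriptors and reports the coefficient of determination separately for the training molecules and for the held-out molecules. Everything specific here is an assumption made for illustration: the data are synthetic random numbers (not the study's descriptors or activities), the number of molecules (50) is hypothetical, and the helper names `fit_mlr`, `predict_mlr` and `r_squared` are not from the original work. Only the five-descriptor setup and the 59% training fraction are taken from the text.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict activities from descriptors using fitted coefficients."""
    return coef[0] + X @ coef[1:]


# Synthetic placeholder data (assumption): 50 hypothetical molecules,
# 5 descriptors standing in for R3m+, GAP, DISPv, HATS2m, JGI2.
rng = np.random.default_rng(0)
n_molecules, n_descriptors = 50, 5
X = rng.normal(size=(n_molecules, n_descriptors))
y = X @ rng.normal(size=n_descriptors) + rng.normal(scale=0.5, size=n_molecules)

# Random split with roughly 59% of the molecules in the training set.
n_train = int(0.59 * n_molecules)
idx = rng.permutation(n_molecules)
train, test = idx[:n_train], idx[n_train:]

coef = fit_mlr(X[train], y[train])
r2_train = r_squared(y[train], predict_mlr(coef, X[train]))
r2_test = r_squared(y[test], predict_mlr(coef, X[test]))

print(f"training R^2: {r2_train:.3f}")
print(f"external R^2: {r2_test:.3f}")
```

A training R^2 that is much higher than the external value is the pattern the abstract describes for the MLR model: a good fit to the training molecules does not by itself guarantee predictive ability on molecules outside the training set. The external statistic used in the study may be defined differently from the plain R^2 computed here.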