Abstract
In silico classification of new compounds by specific properties is a useful tool for guiding further experiments and compound selection. Interaction of new compounds with the efflux pump P-glycoprotein (P-gp) is an important drug property that determines tissue distribution and the potential for drug-drug interactions. We present three datasets on substrate, inhibitor, and inducer activities for P-gp (n = 471), obtained from a literature search, which we compared to an existing evaluation of the Prestwick Chemical Library with the calcein-AM assay (retrieved from PubMed). Additionally, we present decision tree models of these activities, built with three algorithms (CHAID, CART, and C4.5), with predictive accuracies of 77.7 % (substrates), 86.9 % (inhibitors), and 90.3 % (inducers). We also present decision tree models of the calcein-AM assay (79.9 %). In addition to a comprehensive dataset of P-gp-interacting compounds, our study provides evidence of the efficacy of logD descriptors and of two algorithms not commonly used in pharmacological QSAR studies (CART and CHAID).
Keywords: P-glycoprotein, MDR1, Multidrug resistance, Calcein-AM assay, QSAR, Decision trees