Abstract
In this paper we report quantitative structure-activity models linking in vivo Drug-Induced Liver Injury (DILI) caused by organic molecules to parameters both measured experimentally in vitro and calculated theoretically from molecular structure. As a first step, a small database containing information on DILI in humans was created and annotated with experimentally observed hepatotoxic effects. For each compound, a binary “yes/no” annotation was assigned to DILI and to seven endpoints associated with different liver pathologies in humans: Cholestasis (CH), Oxidative Stress (OS), Mitochondrial injury (MT), Cirrhosis and Steatosis (CS), Hepatitis (HS), Hepatocellular (HC), and Reactive Metabolite (RM). Three machine-learning methods (Support Vector Machines, Artificial Neural Networks, and Random Forests) were used to build classification models linking DILI to molecular structure. Three types of models were developed: (i) models using molecular descriptors calculated directly from chemical structure, (ii) models using selected endpoints as “biological” descriptors, and (iii) models using both types of descriptors. Models based solely on molecular descriptors were found to have much weaker predictive performance than those involving endpoints measured in vivo. Because in vivo data are difficult to obtain, at the validation stage we instead used five endpoints (CH, CS, HC, MT, and OS) measured in vitro in human hepatocyte cultures. Models involving either some of the experimental in vitro endpoints or their combination with theoretically calculated descriptors correctly predicted DILI for 9 out of 10 reference compounds in the external test set. This result suggests that combining theoretically calculated parameters with measured in vitro biological data is a promising approach to DILI prediction.
Keywords: Biological descriptor, drug-induced liver injury, human hepatocyte cultures, machine-learning methods, molecular descriptors.
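As an illustration of the three descriptor sets described in the abstract, the following minimal Python sketch shows how feature vectors for model types (i), (ii), and (iii) might be assembled and how the external-set hit rate could be computed. The `Compound` record, the `build_features` and `accuracy` helpers, the descriptor names (`logP`, `mol_weight`), and all values are hypothetical placeholders rather than data or code from this study. The classifier itself is deliberately omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Endpoint abbreviations as defined in the abstract.
ENDPOINTS = ["CH", "OS", "MT", "CS", "HS", "HC", "RM"]
# The five endpoints measured in vitro at the validation stage.
IN_VITRO_ENDPOINTS = ["CH", "CS", "HC", "MT", "OS"]


@dataclass
class Compound:
    """Hypothetical record for one annotated compound."""
    name: str
    molecular: Dict[str, float]  # descriptors calculated from structure
    endpoints: Dict[str, bool]   # binary "yes/no" endpoint annotations
    dili: bool                   # binary DILI annotation


def build_features(c: Compound, mode: str,
                   endpoint_subset: Sequence[str] = ENDPOINTS) -> List[float]:
    """Return the feature vector for one of the three model types."""
    # Sort descriptor names so every compound uses the same column order.
    mol = [c.molecular[k] for k in sorted(c.molecular)]
    bio = [1.0 if c.endpoints[e] else 0.0 for e in endpoint_subset]
    if mode == "molecular":    # model type (i)
        return mol
    if mode == "biological":   # model type (ii)
        return bio
    if mode == "combined":     # model type (iii)
        return mol + bio
    raise ValueError(f"unknown mode: {mode!r}")


def accuracy(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """Fraction of compounds whose predicted DILI label matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label sequences must be non-empty and of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)


# Placeholder compound with made-up values, for illustration only.
example = Compound(
    name="compound_A",
    molecular={"logP": 2.1, "mol_weight": 310.4},
    endpoints={"CH": True, "OS": False, "MT": True, "CS": False,
               "HS": False, "HC": True, "RM": False},
    dili=True,
)

combined_vector = build_features(example, "combined", IN_VITRO_ENDPOINTS)
```

In this sketch, 9 correct predictions out of 10 external compounds correspond to `accuracy(...) == 0.9`. Any of the classifiers named in the abstract could in principle be trained on vectors built this way.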