Abstract
Identifying a drug’s target protein is essential for studying drug-target interaction networks, which are a key area in the drug discovery pipeline. Accurate prediction of a drug’s target protein can therefore help advance drug discovery. In this study, we developed a two-step similarity-based method to predict a drug’s target group. Each step uses a similarity score to make the prediction: the score is derived from a graph representation in the first step and from a chemical functional group representation in the second. Because some drugs target proteins distributed across more than one group, the method provides a series of candidate target groups for each drug. The first-order prediction accuracies on the training set and the test set were 79.01% and 76.43%, respectively, both much higher than the success rate of a random guess. These results suggest that graph representation is a good choice for encoding drugs in this area. We expect that this contribution will help in understanding drug-target interaction networks.
Keywords: Drug-target interaction network, graph representation, chemical functional group representation, jackknife cross-validation, KEGG, multi-label classification, drug discovery, random guess
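The two-step idea summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this study, and it rests on several assumptions: each drug is reduced to a set of string features, a Tanimoto coefficient stands in for both similarity scores, candidate groups are ordered by their best similarity to any training drug, and the second step is used only when the first yields no candidate. The function names (`tanimoto`, `rank_groups`, `predict_two_step`) and the group labels are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Features = FrozenSet[str]
TrainingSet = List[Tuple[Features, Set[str]]]  # (drug features, target groups)


def tanimoto(a: Features, b: Features) -> float:
    """Stand-in similarity score (assumption): shared features / all features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_groups(
    query: Features,
    training: TrainingSet,
    similarity: Callable[[Features, Features], float],
) -> List[str]:
    """Order candidate target groups by their best similarity to the query drug.

    Groups whose best similarity is zero are left out, so an empty list
    means this representation gave no usable evidence.
    """
    best: Dict[str, float] = {}
    for features, groups in training:
        score = similarity(query, features)
        for group in groups:
            if score > best.get(group, 0.0):
                best[group] = score
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(best, key=lambda g: (-best[g], g))


def predict_two_step(
    query_graph: Features,
    query_functional: Features,
    train_graph: TrainingSet,
    train_functional: TrainingSet,
) -> List[str]:
    """Step 1: graph-based similarity. Step 2 (assumed fallback): functional groups."""
    ranked = rank_groups(query_graph, train_graph, tanimoto)
    if ranked:
        return ranked
    return rank_groups(query_functional, train_functional, tanimoto)


# Toy data (hypothetical features and group labels, for illustration only).
train_graph: TrainingSet = [
    (frozenset({"a", "b", "c"}), {"enzyme"}),
    (frozenset({"c", "d"}), {"gpcr", "ion_channel"}),
]
train_functional: TrainingSet = [
    (frozenset({"OH"}), {"nuclear_receptor"}),
]

first = predict_two_step(frozenset({"c"}), frozenset(), train_graph, train_functional)
fallback = predict_two_step(frozenset(), frozenset({"OH"}), train_graph, train_functional)
```

In this sketch the first element of the returned list plays the role of the first-order prediction, and the remaining elements are the lower-order candidates for drugs that may belong to more than one target group.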