Abstract
Background: Experimental identification of drug-target interactions (DTIs) among large numbers of chemical compounds and proteins remains costly and time-consuming. Computational approaches complement experiments by rapidly inferring potential drug or target candidates for diverse large-scale screenings. The most difficult screening scenario (S4) seeks interacting pairs between newly designed chemical compounds (new potential drugs) and proteins (new target candidates). Few current computational approaches can infer potential DTIs in S4, because the new potential drugs have no known targets and the new target candidates have no known drugs. In addition, because of inherent issues in DTI data, such as missing DTIs and the imbalance between the few approved DTIs and the many unknown drug-target pairs, existing metrics may not fairly reflect the performance of inference approaches.
Methods: To address these issues, this paper develops three instance neighborhood-based models: individual-to-individual (I2I), individual-to-group (I2G) and nearest-neighbor-zone (NNZ). In I2I, if a new drug tends to interact with individual targets similar to a new target of interest, it likely interacts with the new target. In I2G, the new drug possibly interacts with the new target if it tends to interact with a target group whose members are similar to each other and at least one of which is similar to the new target. In NNZ, the pair of the new drug and the new target is a potential DTI if it is similar to known DTIs. This paper also designs a topological dense index to guide the selection of the appropriate model for a given dataset. Moreover, an additional metric, Coverage, is introduced to enhance the assessment of DTI inference.
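The neighborhood intuition behind I2I and NNZ can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the product form of the pair similarity, the similarity-weighted vote over known drugs, and the parameter k are all assumptions made here for illustration.

```python
import numpy as np

def nnz_score(drug_sim, target_sim, known_dtis, k=3):
    """NNZ-style score (illustrative): mean similarity of the new pair
    to its k most similar known DTIs.

    drug_sim[i]   : similarity of the new drug to known drug i
    target_sim[j] : similarity of the new target to known target j
    known_dtis    : binary matrix, 1 if known drug i interacts with known target j
    """
    # Assumption: pair similarity is the product of drug and target similarity.
    pair_sim = np.outer(drug_sim, target_sim)
    sims = pair_sim[known_dtis == 1]
    if sims.size == 0:
        return 0.0
    return float(np.sort(sims)[-k:].mean())

def i2i_score(drug_sim, target_sim, known_dtis):
    """I2I-style score (illustrative): estimate how strongly the new drug
    tends to interact with each known target, then transfer that tendency
    to the new target through its most similar individual target."""
    total = drug_sim.sum()
    if total == 0:
        return 0.0
    # Assumption: the new drug's tendency is a similarity-weighted vote
    # over the interaction profiles of known drugs.
    profile = drug_sim @ known_dtis / total
    return float(np.max(profile * target_sim))

# Toy data: two known drugs, two known targets, two known DTIs.
Y = np.array([[1, 0],
              [0, 1]])
d = np.array([0.9, 0.1])   # new drug vs. known drugs
t = np.array([0.8, 0.2])   # new target vs. known targets
print(nnz_score(d, t, Y, k=1), i2i_score(d, t, Y))
```

In this toy case both scores are driven by the single known DTI whose drug and target both resemble the new pair; an I2G-style variant would replace the per-target maximum with an aggregate over a cluster of mutually similar targets.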
Results: On four benchmark datasets, our models demonstrate that the instance neighborhood significantly improves DTI inference. Guided by our topological dense index, the best model for each dataset is chosen and achieves promising performance: ~85%, ~81%, ~86% and ~81% in terms of AUC and ~29%, ~32%, ~32% and ~33% in terms of AUPR, respectively. The superiority of our models is demonstrated both by comparison with two state-of-the-art approaches and by the inference of novel DTIs.
Conclusion: By leveraging the instance neighborhood, our models are able to infer DTIs in the most difficult scenario, S4. Moreover, our topological dense index can guide the selection of the appropriate model for a given dataset.
Keywords: Drug-target interaction, neighborhood, drug similarity, target similarity, classification, state-of-the-art.
Graphical Abstract