Abstract
Background: Metabolic chemical reactions are among the fundamental processes that maintain life. Generally, each reaction requires an enzyme. A metabolic pathway groups a series of chemical reactions at the system level. As compounds and enzymes are two important components of each metabolic pathway, identifying the metabolic pathways in which a given compound or enzyme can participate is an important first step toward understanding the mechanisms of metabolic pathways.
Objective: The purpose of this study was to build efficient computational methods to predict the metabolic pathways of compounds and enzymes.
Methods: Novel multi-label classifiers were proposed to identify the metabolic pathway types, as reported in KEGG, of compounds and enzymes. Three heterogeneous networks with compounds and enzymes as nodes were constructed. To extract more informative features of compounds and enzymes, we generalized the powerful network embedding algorithm Mashup to heterogeneous networks; the generalized version is named MashupH. RAndom k-labELsets (RAKEL) was employed to build the classifiers, with support vector machine or random forest selected as the base classification algorithm.
Results: Ten-fold cross-validation indicated that the proposed classifiers performed well and outperformed the previous classifier that adopted features yielded by Mashup. Furthermore, some key parameters of MashupH that might influence the classifiers were analyzed.
Conclusion: The features yielded by MashupH were more informative than those produced by Mashup on heterogeneous networks. This was the main reason the new classifiers were superior to those using features yielded by Mashup.
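Illustrative sketch: The RAKEL step named in the Methods can be sketched as follows. This is a minimal illustration and not the implementation used in the study. Each ensemble member is trained on a random subset of k labels, treating every combination of those labels as one class (label powerset), and the final label set is obtained by majority vote. The class names `Rakel` and `NearestNeighbor`, the parameter defaults, and the use of a tiny 1-nearest-neighbor learner in place of the support vector machine or random forest base classifier are all assumptions made to keep the example self-contained.

```python
import random


class NearestNeighbor:
    """Tiny 1-NN multi-class learner (assumed stand-in for the SVM or random forest base)."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        # Return the class of the closest training sample for each query.
        return [
            self.y[min(range(len(self.X)), key=lambda i: sq_dist(x, self.X[i]))]
            for x in X
        ]


class Rakel:
    """RAndom k-labELsets: an ensemble of label-powerset classifiers."""

    def __init__(self, n_labels, k=3, n_models=10, base=NearestNeighbor, seed=0):
        rng = random.Random(seed)
        self.n_labels = n_labels
        self.base = base
        # Each ensemble member is responsible for one random subset of k labels.
        self.labelsets = [
            sorted(rng.sample(range(n_labels), k)) for _ in range(n_models)
        ]
        self.models = []

    def fit(self, X, Y):
        # Y[i] is the set of label indices (e.g. pathway types) of sample i.
        self.models = []
        for subset in self.labelsets:
            # Label powerset: the on/off pattern over the subset is one class.
            classes = [tuple(label in y for label in subset) for y in Y]
            self.models.append(self.base().fit(X, classes))
        return self

    def predict(self, X):
        X = list(X)
        votes = [[0] * self.n_labels for _ in X]
        seen = [0] * self.n_labels  # number of models covering each label
        for subset, model in zip(self.labelsets, self.models):
            for label in subset:
                seen[label] += 1
            for i, pattern in enumerate(model.predict(X)):
                for label, active in zip(subset, pattern):
                    votes[i][label] += int(active)
        # Keep a label when more than half of the models covering it vote for it.
        return [
            {
                label
                for label in range(self.n_labels)
                if seen[label] and votes[i][label] / seen[label] > 0.5
            }
            for i in range(len(X))
        ]
```

In use, `X` would hold the embedding features of compounds and enzymes and `Y` their pathway label sets; for example, `Rakel(n_labels=3, k=2).fit(X, Y).predict(X_new)` returns one predicted label set per query sample. Labels that no random subset happens to cover are never predicted, which is one reason the number of ensemble members matters.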
Graphical Abstract