Abstract
Oncogenes are genes that have the potential to cause cancer. Oncogene research can provide insight into the occurrence and development of cancer, thereby helping to prevent cancer and to design effective treatments. This study proposes a network method, the oncogene prediction method based on shortest path algorithm (OPMSP), for identifying novel oncogenes in a large protein network built from protein-protein interaction data. Candidate genes were extracted from the shortest paths connecting each pair of known oncogenes. They were then filtered by a randomization test, and the linkages between them and known oncogenes were measured using protein interaction and sequence data. This method identified 37 new putative oncogenes. Enrichment analysis indicated that these 37 genes are highly associated with several biological processes related to the initiation, progression and metastasis of tumors. Six of these genes (ESR1, CDK9, SEPT2, HOXA10, LMX1B, and NR2C2) are extensively discussed, and several lines of evidence indicate that they may be novel oncogenes.
Keywords: Oncogene, protein-protein interaction, protein sequence, BLAST, shortest path algorithm.
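The path-extraction step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an unweighted, undirected toy network and takes one breadth-first shortest path per pair of known oncogenes, whereas the actual method may weight edges by interaction confidence. The node labels (O1 to O4 for known oncogenes; A, B, E for other proteins) and the function names are hypothetical, and the randomization test and sequence-based filtering are not shown.

```python
from collections import deque
from itertools import combinations


def shortest_path(graph, source, target):
    """Return one shortest path (list of nodes) found by BFS, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def candidate_genes(graph, known):
    """Count, for each non-oncogene node, the oncogene pairs whose path crosses it."""
    counts = {}
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for node in path[1:-1]:  # interior nodes only
            if node not in known:
                counts[node] = counts.get(node, 0) + 1
    return counts


# Hypothetical toy network (a tree, so every shortest path is unique).
edges = [("O1", "A"), ("A", "O2"), ("A", "O3"),
         ("O3", "B"), ("B", "O4"), ("A", "E")]
toy_graph = {}
for u, v in edges:
    toy_graph.setdefault(u, set()).add(v)
    toy_graph.setdefault(v, set()).add(u)

known_oncogenes = {"O1", "O2", "O3", "O4"}
toy_counts = candidate_genes(toy_graph, known_oncogenes)
print(toy_counts)
```

In this toy network, node E lies on no path between known oncogenes and is never proposed as a candidate, while A and B are. In a real analysis, such counts would be only a starting point: the abstract states that candidates are further screened by a randomization test and by interaction and sequence evidence.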