Abstract
Background: Identifying the genes that initiate cell anomalies and cause cancer in humans is an important problem in oncology research. Abnormalities in these genes propagate to other genes in the cell and disrupt its normal function; such genes are known as cancer driver genes (CDGs). Various methods have been proposed for predicting CDGs, most of them computational and based on genomic data, and some novel bioinformatic approaches have also been developed.
Objective: In this article, we propose a network-based algorithm, SalsaDriver (Stochastic approach for link-structure analysis for driver detection), which calculates each gene's receiving and influencing power through a stochastic analysis of the regulatory interaction structure of gene regulatory networks.
Methods: First, regulatory networks for breast, colon, and lung cancers are constructed from gene expression data and a list of regulatory interactions, and the interaction weights are calculated from biological and topological features of the network. The weighted regulatory interactions are then used in a structural analysis of the interactions: the stochastic approach for link-structure analysis is applied by running two separate Markov chains on the bipartite graph derived from the main graph of the gene network. The proposed algorithm categorizes the higher-ranked genes as driver genes.
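The two Markov chains described in the Methods can be illustrated with a minimal sketch of the standard stochastic link-structure analysis on a weighted directed graph. This is an illustration under stated assumptions, not the SalsaDriver implementation: the function name `salsa_scores`, the toy weight matrix, and the reading of authority scores as "receiving power" and hub scores as "influencing power" are all assumptions made here, and the weights are arbitrary rather than derived from biological or topological features.

```python
import numpy as np

def salsa_scores(weights, n_iter=500, tol=1e-12):
    """Hub and authority scores from two Markov chains on a weighted digraph.

    weights[i, j] is the (hypothetical) weight of the regulatory edge i -> j.
    Returns (hub, authority), each a probability vector over genes.
    """
    w = np.asarray(weights, dtype=float)
    row_sums = w.sum(axis=1, keepdims=True)   # total outgoing weight per gene
    col_sums = w.sum(axis=0, keepdims=True)   # total incoming weight per gene

    # Row- and column-normalised copies; genes with no edges keep zero rows/columns.
    w_row = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    w_col = np.divide(w, col_sums, out=np.zeros_like(w), where=col_sums > 0)

    # Hub chain: forward along an out-edge, then backward along an in-edge.
    hub_chain = w_row @ w_col.T
    # Authority chain: backward along an in-edge, then forward along an out-edge.
    auth_chain = w_col.T @ w_row

    def stationary(chain, active):
        # Power iteration started from the uniform distribution on active genes.
        if active.sum() == 0:
            return np.zeros(len(active))
        p = active / active.sum()
        for _ in range(n_iter):
            nxt = p @ chain
            if np.abs(nxt - p).sum() < tol:
                return nxt
            p = nxt
        return p

    hub = stationary(hub_chain, (row_sums.ravel() > 0).astype(float))
    authority = stationary(auth_chain, (col_sums.ravel() > 0).astype(float))
    return hub, authority

# Toy network with four genes and edges 0 -> 2, 0 -> 3, 1 -> 2 (unit weights).
w = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])
hub, authority = salsa_scores(w)
print("hub:", hub.round(3))              # approximately [0.667, 0.333, 0, 0]
print("authority:", authority.round(3))  # approximately [0, 0, 0.667, 0.333]
```

In this toy example, gene 0 regulates two targets and receives the highest hub score, while gene 2 is regulated by two genes and receives the highest authority score. Ranking genes by such scores mirrors the final step in which higher-ranked genes are categorized as drivers.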
Results: The proposed algorithm was compared with 24 other computational and network tools in terms of F-measure and the number of detected CDGs, and the results were validated using four databases. The findings show that SalsaDriver outperforms the other methods and identifies substantially more driver genes.
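For reference, the F-measure used in such comparisons is the harmonic mean of precision and recall of the predicted driver list against a reference list. The sketch below is a generic illustration: the helper name `f_measure` and the placeholder gene labels are hypothetical, and how the four validation databases are combined into a reference set is not specified here.

```python
def f_measure(predicted, reference):
    """Return (precision, recall, F-measure) of predicted genes against a reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Placeholder labels, not real gene symbols.
predicted_drivers = ["G1", "G2", "G3", "G4"]
reference_drivers = ["G1", "G2", "G5"]
precision, recall, f1 = f_measure(predicted_drivers, reference_drivers)
print(precision, recall, f1)
```

Here two of the four predicted genes appear in the reference set, so precision is 2/4, recall is 2/3, and the F-measure is 4/7.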
Conclusion: The SalsaDriver network-based approach is suitable for predicting CDGs and can be used as a complementary method along with other computational tools.
Keywords: Driver genes, cancer, link-structure analysis, regulatory interactions, cell anomalies, F-measure value.