Abstract
Background: Cancers differ in their sites of origin, cell types, and genetic mutations, and these differences give rise to distinct forms of the disease. Identifying genes associated with cancer traits and analyzing their functions across different cancers is therefore important for understanding the mechanisms of cancer.
Objective: This paper aims to overcome the limitations of single-tumor analysis by identifying genes related to each cancer trait across multiple tumor types and analyzing the associations among them.
Methods: We use a structural equation model to quantitatively identify genes associated with cancer traits in five cancers, and we verify the correctness and effectiveness of the method through correlation analysis. We then analyze the functions of the identified genes and the biological processes they are involved in using GO and KEGG pathway analysis. Finally, we further analyze and verify the results through a protein interaction network and survival analysis.
Results: Using data from five types of cancer, we identify 44 genes related to cancer traits. We verify the combined effects of these genes and the biological processes in which they participate. We also find key gene pathways and two significant gene functional modules.
Conclusion: The results show that the structural equation model has unique advantages in quantifying the combined effects of genes. Many of the identified genes are tumor metastasis genes and are associated with multiple cancers, which suggests strong potential commonalities among cancers. Four cancer genes are related not only to protein metabolism but also to the regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism, which are of great significance for cancer treatment.
Keywords: Pan-cancer, cancer traits, structural equation model, correlation analysis, pathway analysis, tumor metastasis.