Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Perspective

Differential Gene Expression in Cancer: An Overrated Analysis?

Author(s): Jessica Carballido* and Rocío Cecchini

Volume 17, Issue 5, 2022

Published on: 15 June, 2022

Page: [396 - 400] Pages: 5

DOI: 10.2174/1574893617666220422134525

Abstract

The search for marker genes associated with different pathologies traditionally begins with some form of differential expression analysis. This step is treated as essential in most functional genomics studies that analyze gene expression data. In this article, we take a different approach: we start from the known biological significance of different groups of genes and then assess the proportion of differentially expressed genes within each group. The analysis is performed on cancer expression data to gauge the true importance of differential expression from the standpoint of different research objectives. First, the percentage of differentially expressed genes was found to be generally low within gene sets annotated in KEGG. Second, when training statistical and machine learning models and using them for prediction, restricting the input to differentially expressed genes substantially improved their results.
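
As a rough illustration of the first step described above, the proportion of differentially expressed genes in each annotated gene set can be computed with a simple set intersection. This is only a minimal sketch under stated assumptions, not the authors' pipeline: the function name `de_proportion`, the pathway labels, and the gene identifiers are all hypothetical, and the list of differentially expressed genes is assumed to have been produced beforehand by some separate analysis.

```python
from typing import Dict, Set


def de_proportion(gene_sets: Dict[str, Set[str]], de_genes: Set[str]) -> Dict[str, float]:
    """Return, for each gene set, the fraction of its members found in de_genes.

    An empty gene set is assigned 0.0 to avoid dividing by zero.
    """
    return {
        name: (len(genes & de_genes) / len(genes) if genes else 0.0)
        for name, genes in gene_sets.items()
    }


# Toy inputs, invented for illustration only (not real KEGG annotations).
gene_sets = {
    "pathway_A": {"G1", "G2", "G3", "G4"},
    "pathway_B": {"G3", "G5"},
}
# Hypothetical list of differentially expressed genes from a prior analysis.
de_genes = {"G1", "G5", "G9"}

proportions = de_proportion(gene_sets, de_genes)
for name, fraction in proportions.items():
    print(f"{name}: {fraction:.0%} of genes differentially expressed")
```

Genes in the differentially expressed list that belong to no gene set (here `G9`) simply do not contribute to any proportion.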

Keywords: KEGG, differential expression analysis (DE), RNA sequencing expression, cancer expression data, TCGA, AmiGO.
