Abstract
The search for marker genes associated with different pathologies traditionally begins with some form of differential expression analysis. This step is essential in most functional genomics studies that analyze gene expression data. In this article, we present a different analysis: we start from groups of genes with known biological significance and then assess the proportion of differentially expressed genes in each group. The analysis is performed on cancer expression data to assess how much differential expression actually matters for different research objectives. First, we found that the percentage of differentially expressed genes within gene sets annotated in KEGG is generally low. On the other hand, we observed that using differentially expressed genes in the training and prediction process of both statistical and machine learning models substantially improves their results.
Keywords: KEGG, differential expression analysis (DE), RNA sequencing expression, cancer expression data, TCGA, AmiGO.
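The central quantity of the abstract, the proportion of differentially expressed genes within each annotated gene set, can be illustrated with a minimal Python sketch. It assumes the gene sets and the list of differentially expressed genes are already available; the function name `de_proportion_per_set`, the pathway names, and the gene symbols are hypothetical placeholders, not the authors' code or data.

```python
from typing import Dict, Iterable, Set


def de_proportion_per_set(
    gene_sets: Dict[str, Iterable[str]],
    de_genes: Set[str],
) -> Dict[str, float]:
    """Return, for each gene set, the fraction of its genes that are DE."""
    proportions = {}
    for name, genes in gene_sets.items():
        members = set(genes)
        if not members:
            continue  # skip empty sets to avoid division by zero
        proportions[name] = len(members & de_genes) / len(members)
    return proportions


# Hypothetical toy data: pathway names and gene symbols are placeholders.
toy_gene_sets = {
    "pathway_A": ["G1", "G2", "G3", "G4"],
    "pathway_B": ["G3", "G5"],
}
toy_de_genes = {"G3", "G9"}

result = de_proportion_per_set(toy_gene_sets, toy_de_genes)
print(result)
```

In a real analysis, the gene sets would come from an annotation source such as KEGG and the differentially expressed genes from a dedicated statistical test; the sketch covers only the final counting step.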