Abstract
Background: Type 2 Diabetes Mellitus (T2DM) is a chronic disease, and molecular diagnosis could help guide the treatment of T2DM patients. With the development of sequencing technology, a large number of differentially expressed genes have been identified from expression data. However, machine learning methods applied to expression data alone may identify only a locally optimal solution as the signature.
Objective: Mutation information obtained by inheritance can better reflect the relationship between genes and disease. We therefore aimed to integrate mutation information to identify the signature more accurately.
Methods: We integrated Genome-Wide Association Study (GWAS) data and expression data using expression Quantitative Trait Loci (eQTL) analysis to derive a T2DM predictive signature (T2DMSig-10). First, we used GWAS data to obtain a list of T2DM susceptibility loci. Then, we used eQTL analysis to obtain risk Single Nucleotide Polymorphisms (SNPs) and combined them with pancreatic β-cell gene expression data to obtain 10 protein-coding genes. Finally, we combined these genes with equal weights.
Results: We evaluated T2DMSig-10 using Receiver Operating Characteristic (ROC) analysis, single-gene removal and addition, Gene Ontology functional enrichment, and protein-protein interaction network analysis. The results showed that T2DMSig-10 had an excellent predictive effect on T2DM (AUC=0.99) and was highly robust.
Conclusion: We obtained a predictive signature for T2DM and further verified it.
Keywords: Type 2 diabetes mellitus, genome-wide association study, expression quantitative trait loci, predictive signature, AUC=0.99, ROC.
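As an illustration of the equal-weight combination, ROC evaluation, and single-gene removal described in the Methods and Results, the following is a minimal sketch and not the authors' implementation. It assumes each gene is z-scored across samples before averaging and that higher scores indicate T2DM; the abstract specifies neither. The function names and the input layout (a dict mapping gene name to per-sample expression values) are hypothetical.

```python
from statistics import mean, pstdev


def equal_weight_score(expr, genes):
    """Average the per-gene z-scores with equal weights (assumed normalisation).

    expr: dict mapping gene name -> list of expression values, one per sample.
    """
    n_samples = len(expr[genes[0]])
    scores = [0.0] * n_samples
    for gene in genes:
        values = expr[gene]
        mu, sd = mean(values), pstdev(values)
        for i, v in enumerate(values):
            z = (v - mu) / sd if sd > 0 else 0.0  # constant genes contribute 0
            scores[i] += z / len(genes)           # every gene has weight 1/len(genes)
    return scores


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: 1 for T2DM samples, 0 for controls.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def leave_one_gene_out(expr, genes, labels):
    """Recompute the AUC after removing each gene in turn (robustness check)."""
    result = {}
    for gene in genes:
        reduced = [g for g in genes if g != gene]
        result[gene] = auc(equal_weight_score(expr, reduced), labels)
    return result
```

Given an expression table `expr` for the signature genes and a label vector `labels`, `auc(equal_weight_score(expr, genes), labels)` gives the signature-level AUC, and `leave_one_gene_out(expr, genes, labels)` shows how much that AUC depends on any single gene.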