Abstract
Background: Gene regulation is a complex cellular mechanism that increases or decreases gene expression. These regulatory relationships form a gene regulatory network (GRN), a collection of genes and gene products that interact with one another. High-throughput technologies generate large volumes of gene expression data that are useful for analyzing GRNs, and biologists are interested in the genetic knowledge hidden in these data. However, the knowledge extracted by existing data mining approaches is either insufficient for inferring the GRN topology or does not faithfully represent real genetic regulation in the cell.
Objective: In this work, we extract genetic interactions from data produced by high-throughput technologies such as microarrays (DNA chips).
Methods: To extract expressive and explicit knowledge about interactions between genes, we used gradual pattern and rule extraction, a method for numerical data that extracts frequent co-variations between gene expression values. Furthermore, we chose to integrate experimental biological data and biological knowledge into the extraction process.
Results: Validation on real gene expression data from the model plant Arabidopsis and from human lung cancer demonstrated the performance of this approach.
Conclusion: The extracted gradual rules express the genetic interactions that compose a GRN. These rules help in understanding complex systems and cellular functions.
Keywords: Knowledge extraction, gene expression data, gradual rules, GRN, genetic interaction, Arabidopsis, floral development, lung cancer.
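As an illustration of what a gradual pattern over expression data looks like, the following is a minimal sketch and not the implementation used in this work. It assumes one common definition of support, the fraction of sample pairs in which every gene of the pattern varies in its stated direction; the function name `gradual_support`, the gene labels, and the expression values are hypothetical toy inputs.

```python
from itertools import combinations

def gradual_support(expr, pattern):
    """Fraction of sample pairs that respect every (gene, direction) item.

    expr:    dict mapping a gene name to a list of expression values,
             one value per sample (all lists have the same length).
    pattern: list of (gene, direction) items, direction being '+' or '-'.
    Assumption: support is defined over pairs of samples; other
    definitions (for example, longest ordered chain) also exist.
    """
    n_samples = len(next(iter(expr.values())))

    def respects(a, b):
        # True if, going from sample a to sample b, each gene moves
        # strictly in the direction required by the pattern.
        return all(
            expr[gene][a] < expr[gene][b] if direction == "+"
            else expr[gene][a] > expr[gene][b]
            for gene, direction in pattern
        )

    pairs = list(combinations(range(n_samples), 2))
    # A pair counts if it respects the pattern in either orientation.
    concordant = sum(1 for i, j in pairs if respects(i, j) or respects(j, i))
    return concordant / len(pairs)

# Hypothetical toy expression matrix: three genes measured in four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 2.5, 3.5, 5.0],
    "geneC": [4.0, 3.0, 3.5, 1.0],
}

# "The more geneA, the more geneB"
support_ab = gradual_support(expr, [("geneA", "+"), ("geneB", "+")])
# "The more geneA, the less geneC"
support_ac = gradual_support(expr, [("geneA", "+"), ("geneC", "-")])
print(support_ab, support_ac)
```

Under this definition, a pattern whose support exceeds a user-chosen threshold would be kept as frequent, and a gradual rule such as "the more geneA, the less geneC" could then be read as a candidate interaction between the two genes.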