Abstract
Innovative models for analyzing high-throughput biological data are becoming increasingly important in the post-genomic era. Correlation networks are rapidly emerging as powerful models for representing various types of biological relationships, especially for extracting knowledge from gene expression data. Analyses using other popular network models in biology have revealed that structures within a graph model, such as high-degree nodes and cliques, often correspond to cellular functions. Correlation networks, which can be used to measure the relationships between patterns of gene expression, are capable of representing whole-genome expression assays. In this study, we build correlation networks from publicly available gene expression datasets; once they are built, we identify graph-theoretic structures (critical nodes and dense subgraphs) and use centrality measures to infer the biological impact of these structures within the network. We then validate the link between network components (such as critical nodes and node degrees) and biological function by exploring the biological properties of a set of nodes with high centrality in the correlation network. In addition, we use network integration to identify essential genes in an integrated correlation network obtained as the union of networks built from mice of different age groups. By examining clusters connected by highly central nodes in this integrated network, we find a set of essential genes and identify several cellular subsystems that point towards aging-related mechanisms. The results provide clear evidence that correlation networks are a powerful tool for analyzing temporal biological data and, consequently, for making use of the wealth of gene expression assays currently available.
Keywords: Cellular subsystems, centrality measures, correlation networks, essential genes, graph parameters.
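
The pipeline summarized in the abstract (build a correlation network per dataset, take the union of the networks, then rank nodes by centrality) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the method used in the study: the choice of Pearson correlation, the 0.9 threshold, the use of degree centrality as the centrality measure, the gene names, and the toy expression values are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def build_correlation_network(expression, threshold=0.9):
    """Map each gene to the set of genes with |r| >= threshold (assumed cutoff)."""
    network = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            network[g1].add(g2)
            network[g2].add(g1)
    return network


def integrate(*networks):
    """Union of several networks: a node or edge is kept if it occurs in any input."""
    merged = {}
    for net in networks:
        for gene, neighbours in net.items():
            merged.setdefault(gene, set()).update(neighbours)
    return merged


def degree_centrality(network):
    """Fraction of the other nodes that each node is connected to."""
    n = len(network)
    if n < 2:
        return {gene: 0.0 for gene in network}
    return {gene: len(nb) / (n - 1) for gene, nb in network.items()}


# Hypothetical toy expression profiles (gene -> values across samples).
young = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 3, 2, 1],
    "geneD": [1, 3, 2, 5],
}
old = {
    "geneA": [1, 2, 3, 4],
    "geneB": [3, 1, 4, 2],
    "geneC": [2, 1, 4, 3],
    "geneD": [2, 4, 6, 8],
}

network_young = build_correlation_network(young)
network_old = build_correlation_network(old)
integrated = integrate(network_young, network_old)
centrality = degree_centrality(integrated)
hub = max(centrality, key=centrality.get)
```

In this toy data, `geneA` is linked to `geneB` and `geneC` only in the first network and to `geneD` only in the second, so it becomes the most central node only after integration. That is the kind of node the integrated analysis is meant to surface; other centrality measures (for example betweenness) could be substituted for degree centrality without changing the overall structure of the sketch.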