Abstract
The availability of complete genomes and global gene expression profiling has greatly facilitated analysis of complex genetic regulatory systems. We describe the use of a bioinformatics strategy for analyzing the cis-regulatory design of genes differentially regulated during viral infection of a target cell. The large-scale transcriptional response of human embryonic kidney (HEK293) cells to reovirus (serotype 3 Abney) infection was measured using the Affymetrix HU-95Av2 gene array. Comparing the 2000 base pairs of 5′ upstream sequence for the most differentially expressed genes revealed highly preserved sequence regions, which we call "modules". Higher-order patterns of modules, called "supermodules", were significantly over-represented in the 5′ upstream regions of transcriptionally responsive genes. These supermodules contain binding sites for multiple transcription factors and tend to define the role of genes in processes associated with reovirus infection. The supermodular design encodes a cis-regulatory logic for transducing upstream signaling to control the expression of genes involved in similar biological processes. In the case of reovirus infection, these processes recapitulate the integrated response of cells, including signal transduction, transcriptional regulation, cell cycle control, and apoptosis. The computational strategies described for analyzing gene expression data to discover cis-regulatory features and associate them with pathological processes represent a novel approach to studying the interaction of a pathogen with its target cells.
Keywords: Cis-regulation, gene arrays, promoters, sequence modules, transcription factors, Gene Ontology, signaling, reovirus, apoptosis