Abstract
For historical reasons, bioinformatics has focused on the analysis of coding sequences. Genome annotation mainly consists of locating putative open reading frames and assigning a function to their products. Non-coding sequences are, however, an essential part of the information contained in a genome. In particular, these sequences mediate transcriptional regulation, which is crucial to many aspects of life: metabolic regulation, embryonic development, cell cycle, immune response, etc. In silico analysis of non-coding sequences can provide important information about gene function and about the way genes interact with each other to form molecular networks. Predicting regulatory elements requires different algorithms from those used to analyze coding and protein sequences. This paper discusses several recent attempts to analyze the non-coding fraction of whole genomes, and emphasizes the different ways in which comparative genomics has been used to improve the prediction of regulatory elements.
Keywords: bioinformatics, transcriptional regulation, comparative genomics, pattern discovery, sequence analysis