Abstract
Spectral estimation techniques are widely used in modern signal processing systems. Recently, they have found important applications in the analysis of DNA data. In this paper, we review parametric and non-parametric spectral estimation methods for DNA sequence and microarray data analysis. The discrete Fourier transform (DFT) is the most commonly used technique for spectral analysis of digital signals. It can reveal gene locations in a DNA sequence and can also be used to detect repetitive elements. However, when applied to a short data segment, the DFT produces so-called windowing, or data truncation, artifacts. Parametric spectral estimation methods, such as the autoregressive (AR) model, overcome this problem and can be used to obtain a high-resolution spectrum of the input signal. We demonstrate the advantages of the AR model for the identification of protein coding regions and the detection of DNA repeats. We also review the DFT, AR models, and other spectral estimation techniques for the analysis of microarray time series data.
Keywords: Signal processing, spectral estimation, the Fourier transform (FT), the autoregressive (AR) model, the maximum entropy method (MEM), DNA sequence analysis, gene prediction, detection of DNA repeats, microarray time series data analysis
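As a rough illustration of the two families of methods summarized above, the following sketch computes a DFT-based period-3 power and a Yule-Walker AR spectrum for a DNA string. It is a minimal example under stated assumptions, not the implementation used in this paper: the binary indicator-sequence mapping, the Yule-Walker estimator, and the function names `indicator_sequences`, `period3_power`, and `ar_spectrum` are all chosen here for illustration.

```python
import numpy as np

def indicator_sequences(dna):
    """Map a DNA string to four binary indicator sequences, one per base.
    (Assumed numerical mapping; other mappings are possible.)"""
    seq = np.frombuffer(dna.upper().encode("ascii"), dtype=np.uint8)
    return {b: (seq == ord(b)).astype(float) for b in "ACGT"}

def period3_power(dna):
    """Sum over the four bases of |DFT|^2 at bin k = N/3.
    Assumes len(dna) is divisible by 3 so that the bin is exact."""
    k = len(dna) // 3
    total = 0.0
    for x in indicator_sequences(dna).values():
        total += np.abs(np.fft.fft(x)[k]) ** 2
    return total

def ar_spectrum(x, order, n_freq=512):
    """AR power spectrum via the Yule-Walker equations.
    Returns frequencies in [0, 0.5] cycles/sample and the spectrum values."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the DC component
    n = len(x)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(x[:n - lag], x[lag:]) / n for lag in range(order + 1)])
    # Solve the Toeplitz system R a = r[1:] for the AR coefficients
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    sigma2 = r[0] - np.dot(a, r[1:])      # prediction-error variance
    freqs = np.linspace(0.0, 0.5, n_freq)
    lags = np.arange(1, order + 1)
    # |1 - sum_k a_k exp(-j 2 pi f k)|^2 evaluated on the frequency grid
    denom = np.abs(1.0 - np.exp(-2j * np.pi * np.outer(freqs, lags)) @ a) ** 2
    return freqs, sigma2 / denom
```

For a toy input such as `"ATG" * 30`, `period3_power` is large because every base recurs with period 3, and `ar_spectrum(indicator_sequences("ATG" * 30)["A"], order=2)` peaks close to 1/3 cycles per sample. In practice the model order and the window length would need to be chosen for the data at hand.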