Abstract
Identifying the protein coding regions of DNA sequences is an important step in gene annotation. It is well known that the protein coding regions of most genomic sequences exhibit a period-3 pattern due to the non-uniform distribution of codons. To identify protein coding regions more efficiently, a new identification approach is proposed that combines the chirp z transform and the wavelet transform and is based on this period-3 property. The method was applied to 17 DNA sequences from different organisms and achieved a sensitivity above 80% for every sequence. Requiring no prior training set, the approach is fast and could potentially be used widely and conveniently.
Keywords: Chirp z transform, Fourier transform, identification, period-3 pattern, protein-coding regions, wavelet transform.
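
The period-3 property mentioned in the abstract can be illustrated with a minimal sketch. It is not the method proposed here: it omits both the chirp z transform and the wavelet transform, and only evaluates the Fourier spectrum of the four base indicator sequences at the single frequency 1/3 inside a sliding window. The function names, the window length of 351 bases, and the step of 3 are assumptions chosen for illustration.

```python
import cmath


def period3_power(seq):
    """Sum over A, C, G, T of |DFT at frequency 1/3|^2 of the indicator sequences."""
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        # Indicator sequence: 1 where this base occurs, 0 elsewhere.
        # Only positions holding the base contribute to the sum.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total


def sliding_period3(seq, window=351, step=3):
    """Return (start, power) pairs for each window along the sequence.

    The window and step defaults are illustrative assumptions.
    """
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # A strictly codon-periodic toy sequence has strong period-3 power,
    # while a period-4 repeat has essentially none.
    print(period3_power("ATG" * 10))
    print(period3_power("ACGT" * 3))
```

In such a sketch, windows whose period-3 power stands well above the background would be candidates for coding regions; how the threshold is chosen is left open here.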