Abstract
A sequence alignment algorithm is a basic building block for protein analysis and nucleic acid analysis in bioinformatics. Such alignment represents the similarities and differences of two or more compared sequences. Thus, there have been many algorithms and tools studied and developed. In this paper, we focus on the PHRED based sequence alignment algorithm using Needleman-Wunsch dynamic programming. Although it is well known and proven to be reliable to some extent, it suffers from the heavy computation of producing scoring matrix based on dynamic programming whose time complexity is O(mn). We propose a method applying quadrant method in that process to reduce the computational loads. Also, PHRED based algorithms suffer from the environment when low quality bases are frequently in tips of DNA fragments. Thus, we designed a fuzzy logic system to control the gap cost dynamically to improve the quality of the alignment. In the experiment using real genome data from NCBI (National Center for Biotechnology Information), we verify that the proposed method reduces the computational loads by half in producing scoring matrix and thus the alignment quality is also improved by our fuzzy inference system.
Keywords: DNA sequence alignment, dynamic programming, fuzzy inference system, gap cost, quadrant, quality score.