Abstract
Background: Gapped palindrome structures can have profound effects on chromosomes and are responsible for neurological diseases in humans. A gapped palindrome is a palindrome in which a spacer (a run of characters) separates the left and right palindromic arms of the string. Gapped palindromes are divided into two classes: long armed and length constrained.
Objective: In practical applications such as DNA sequence analysis, it is desirable to find gapped palindromes efficiently.
Method: This paper presents linear-time, O(n), algorithms for solving both types of the gapped palindrome problem in biological sequences using an enhanced suffix array.
Results: Experimental results show that our algorithms are space efficient, fast, and easy to implement. We also provide an open-source standalone application, fapa-gp, for searching different classes of gapped palindromes in genome sequences. It includes the source code of the proposed algorithms, the standalone application, and other supplementary materials.
Conclusion: The presented algorithms find both long armed and length constrained gapped palindromes in biological DNA sequences while verifying all the required conditions. They also analyze short DNA sequences easily.
Keywords: Palindromes, gapped palindromes, biological gapped palindromes, long armed, length constrained, enhanced suffix array.
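To make the definition in the Background concrete, the following minimal Python sketch checks whether a candidate (left arm, gap, right arm) triple forms a gapped palindrome and which class it falls into. It is a brute-force illustration only, not the enhanced suffix array algorithm presented in this paper. All function and parameter names are hypothetical. The class conditions are assumptions based on the commonly used definitions: long armed means the gap is no longer than an arm, and length constrained means the arm length has a lower bound and the gap length lies between a minimum and a maximum. The `complement` flag is likewise an assumption, covering the biological variant in which the right arm is the reverse complement of the left arm.

```python
# Translation table for DNA base complements (assumed alphabet: A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_gapped_palindrome(seq, start, arm, gap, complement=False):
    """Return True if seq[start:] begins with left arm + gap + mirrored right arm."""
    end = start + 2 * arm + gap
    if arm <= 0 or gap < 0 or start < 0 or end > len(seq):
        return False
    left = seq[start:start + arm]
    right = seq[start + arm + gap:end]
    mirrored = left[::-1]
    if complement:
        # Biological variant: the right arm is the reverse complement of the left arm.
        mirrored = mirrored.translate(COMPLEMENT)
    return right == mirrored


def is_long_armed(arm, gap):
    """Assumed definition: the gap is no longer than an arm."""
    return gap <= arm


def is_length_constrained(arm, gap, min_arm, min_gap, max_gap):
    """Assumed definition: arm length bounded below, gap length bounded on both sides."""
    return arm >= min_arm and min_gap <= gap <= max_gap


if __name__ == "__main__":
    # "ACG" + "TT" + "GCA": arm length 3, gap length 2.
    print(is_gapped_palindrome("ACGTTGCA", 0, 3, 2))
    print(is_long_armed(3, 2))
```

Each check costs time proportional to the arm length, so enumerating every start position, arm length, and gap length this way is far slower than linear time. That cost is the motivation for an index-based approach such as the enhanced suffix array.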