Abstract
Protein motif discovery is an important research topic in both bioinformatics and protein science. This paper presents a novel motif discovery algorithm that finds a set of motifs to represent a protein family. The algorithm combines a method for abstracting important features, a location-sensitive approach for connecting pairs of features, and a repeated connection procedure that generates the motif set. The algorithm is applied to motif discovery in 21 ligase subfamilies. The results show that the discovered motifs effectively represent the characteristics of the subfamilies. The proposed algorithm is a potentially useful tool for protein family prediction.
Keywords: Motif discovery, protein motif, sequence pattern matching, ligases, protein family, prediction