Abstract
Background: Identifying new bacterial type III secreted effectors is a major computational challenge. At least a dozen machine learning algorithms have been developed, but they have achieved only limited success. Sequence similarity appears to be important to biologists but is frequently neglected by developers of effector prediction algorithms, even though this strategy achieved considerable success in the field a decade ago.
Objective: This study aimed to develop a sequence similarity-based effector prediction tool.
Method: We propose a recursive sequence alignment strategy using Hidden Markov Models to comprehensively identify homologs of known YopJ/P full-length proteins, effector domains, and N-terminal signal sequences.
Results: Using this method, we identified 155 different YopJ/P-family effectors and 59 proteins with YopJ/P N-terminal signal sequences from 27 genera and more than 70 species. Among these genera, we also identified, for the first time, one type III secretion system (T3SS) from Uliginosibacterium and two T3SSs from Rhizobacter. Higher conservation of effector domains, N-terminal fusion of signal sequences to other effectors, and the exchange of N-terminal signal sequences between different effector proteins were frequently observed for YopJ/P-family proteins. These observations made it feasible to identify new effectors by screening separately for similarity to the N-terminal signal peptides and to the effector domains of known effectors. The method can also be applied to search for homologs of other known T3SS effectors.
Conclusion: We developed a new sequence alignment-based method that could effectively facilitate the identification of new T3SS effectors.
Keywords: Type III secreted effectors, Hidden Markov Model, YopJ/P, machine learning, effector proteins, sequence similarity.
Graphical Abstract