Abstract
Despite substantial recent progress, gene structural prediction remains a challenging problem in bioinformatics. The importance of a detailed understanding of gene splicing is underlined by the fact that ∼10-15% of human genetic diseases are caused by mutations that affect splice junctions. We briefly introduce the problem, mention the existing approaches to gene structural annotation, and provide an overview of current methods. In particular, this paper explains why homology-based gene structural prediction appears to be more difficult than it might seem. The design of splice site (SS) sensors is reviewed, with a rigorous comparison of key designs. Finally, a discussion of methods in ab initio gene structural prediction is accompanied by an extensive comparative performance study. We draw conclusions regarding the current state of the art and speculate about future research directions. Applications used to evaluate performance characteristics of various gene structural prediction programs are available online at http://www.wyomingbioinformatics.org/~achurban/.
Keywords: Gene Structural Prediction, Non-Canonical Gene Set, Weight Matrix Method (WMM), Maximal Dependence Decomposition (MDD), Translation Initiation Sites (TISs)