Abstract
Biclustering analysis is a useful methodology for discovering local coherent patterns hidden in a data matrix. Unlike traditional clustering, which searches for groups of coherent patterns using the entire feature set, biclustering performs simultaneous pattern classification in both the row and column directions of a data matrix. The technique has found applications in many fields, most notably bioinformatics. In this paper, we give an overview of the biclustering problem and review some existing biclustering algorithms in terms of their underlying methodology, search strategy, detected bicluster patterns, and validation strategies. Moreover, we show that the geometry of biclustering patterns can be used to solve biclustering problems effectively. Well-known methods in signal and image analysis, such as the Hough transform and relaxation labeling, can be employed to detect the geometrical biclustering patterns. We present performance evaluation results for several well-known biclustering algorithms on both artificial and real gene expression datasets. Finally, we discuss several interesting applications of biclustering.
Keywords: Biclustering, clustering, gene expression data analysis, geometrical biclustering, multidimensional data analysis, pattern discovery, coherent evolutions, Non-negative Matrix Factorization algorithm, Flexible Overlapped biClustering, decomposition