Abstract
Background: MicroRNAs (miRNAs) are short non-coding RNAs that serve as key regulators at the post-transcriptional level in many important biological processes. In recent years, miRNA expression profiles have been widely investigated and shown to be promising biomarkers for discriminating subtypes of complex human diseases and for measuring treatment effects.
Methods: Most analysis approaches for DNA microarray data can be applied to miRNA microarray data, such as statistical tests for differential expression analysis and clustering for co-regulation analysis. Benefiting from the comprehensive annotation available for protein-coding genes, gene expression analysis is usually guided by prior biological knowledge in order to obtain more biologically meaningful results. However, functional annotation of miRNAs is relatively scarce, so methods based on prior knowledge are hard to apply to miRNAs. In this paper, we incorporate Gene Ontology information on the target genes of miRNAs into the clustering of miRNAs and propose a combined similarity measure.
Results: Experiments were conducted on two public miRNA microarray data sets. The results show that the new similarity measure can improve clustering quality with regard to classification accuracy and the functional enrichment significance of the clusters.
Conclusion: The clustering of microRNA expression profiles can be improved by incorporating domain knowledge, thus resulting in more functionally compact clusters, which are the basis for the identification of potential miRNA biomarkers and the construction of miRNA co-regulation networks.
Keywords: microRNA, clustering, functional similarity, biomarkers, microarray data, functional annotation.