Abstract
Background: Various theories exist about the causes of hereditary diseases, but doctors believe that both genetic and environmental factors play an essential role in the incidence and spread of these diseases.
Objective: To identify the genes that cause a disease, inter-cell or inter-tissue communications must be determined. Inter-cell and inter-tissue interactions can be characterized using gene expression data, and disorders that lead to widespread changes can be identified by investigating this gene expression information.
Methods: In this paper, inter-cell and inter-tissue communications for various diseases are identified using an innovative similarity criterion based on graph topological structure characteristics together with an extended clustering ensemble. The proposed method proceeds in two stages. First, several clustering models are combined to detect initial inter-cell or inter-tissue communications and to produce better results than any single algorithm. Second, the cell-to-cell or tissue-to-tissue similarity within each cluster is identified through a similarity criterion based on the graph topological structure.
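As an illustration of the first stage only, the following is a minimal sketch of one common way to combine several clusterings: a co-association (evidence accumulation) consensus. It is an assumption that the ensemble resembles this scheme; the abstract does not specify the combination rule, and the function names, the threshold of 0.5, and the toy labelings are all hypothetical.

```python
def co_association(labelings):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    base clusterings that place items i and j in the same cluster."""
    n = len(labelings[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Consensus labels: connected components of the graph that links
    two items when they co-occur in more than `threshold` of the base
    clusterings (hypothetical rule, not the paper's)."""
    co = co_association(labelings)
    n = len(co)
    consensus = [-1] * n          # -1 marks "not yet assigned"
    next_label = 0
    for start in range(n):
        if consensus[start] != -1:
            continue
        consensus[start] = next_label
        stack = [start]
        while stack:              # depth-first walk over strong links
            i = stack.pop()
            for j in range(n):
                if consensus[j] == -1 and co[i][j] > threshold:
                    consensus[j] = next_label
                    stack.append(j)
        next_label += 1
    return consensus


# Three hypothetical base clusterings of six samples (cells or tissues).
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 2],
]
print(consensus_clusters(base))
```

In this toy input the base clusterings disagree about the third and sixth samples, and the consensus keeps only the groupings that a majority of them support.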
Results: The proposed method was evaluated on the UCI and FANTOM5 datasets. Experiments on the FANTOM5 dataset yield a Silhouette coefficient of 0.901 with 18 clusters for cells and 0.762 with 13 clusters for tissues.
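For readers unfamiliar with the metric, the sketch below computes the standard mean Silhouette coefficient: for each sample, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the score is (b - a) / max(a, b). The use of Euclidean distance and the toy points are assumptions made for illustration; the abstract does not state which distance was used.

```python
import math


def silhouette_score(points, labels):
    """Mean Silhouette coefficient over all samples (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the Silhouette coefficient needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)    # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])   # compact, separated clusters
bad = silhouette_score(points, [0, 1, 0, 1])    # clusters that mix the pairs
print(good > bad)
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest samples assigned to the wrong cluster.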
Conclusion: The maximum inter-cell or inter-tissue similarity within each cluster can be exploited to detect relationships between diseases.
Keywords: Patient behavior modeling, communications, clustering ensemble, topological graph structure, FANTOM5, cell.
Graphical Abstract