Abstract
Background: Clustering is one of the most important data mining methods. K-means (also called c-means) and its derivatives have been a focus of clustering research in recent years. Clustering methods can be divided into two categories according to how they handle uncertainty: hard clustering and soft clustering. Within the k-means family, Hard C-Means clustering (HCM) is a hard clustering method, whereas Fuzzy C-Means clustering (FCM) is a soft clustering method. Linearly inseparable data remain a major challenge for clustering and classification algorithms, and further improvement is required in the big data era.
Objective: The Rough K-Means (RKM) algorithm, which is based on fuzzy roughness, is also a hot topic in current research. Rough set theory and fuzzy theory are both powerful tools for describing uncertainty and are essentially the same in this respect. RKM can therefore be kernelized by the same means as Kernel Fuzzy C-Means (KFCM). In this paper, we put forward a Kernel Rough K-Means algorithm (KRKM) that enables RKM to solve nonlinear problems. KRKM extends the ability of RKM to process complex data and addresses the uncertainty of soft clustering.
Methods: This paper presents the procedure of the Kernel Rough K-Means algorithm (KRKM). Its clustering accuracy was then evaluated on data sets from the UCI repository. The experimental results showed that KRKM improved clustering accuracy compared with the RKM algorithm.
Results: The classification precision of both KFCM and KRKM improved. KRKM achieved slightly higher classification precision than KFCM, indicating that KRKM is also an attractive alternative clustering algorithm and performs well on nonlinear clustering problems.
Conclusion: Comparison with the precision of the KFCM algorithm showed that KRKM has a slight advantage in clustering accuracy. KRKM is an effective clustering algorithm that can be selected for nonlinear clustering.
Keywords: K-Means, kernel function, rough set, clustering, big data, KRKM.
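Illustrative sketch: the kernelization idea described in the Objective can be outlined as follows. This is a minimal sketch under stated assumptions, not the paper's procedure: it assumes a Gaussian kernel, cluster centers kept in the input space (as in common KFCM formulations), and a ratio-threshold rule for the rough lower and upper approximations. The names `gaussian_kernel`, `kernel_sq_distance`, `rough_assign`, `sigma` and `ratio` are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_sq_distance(x, v, sigma=1.0):
    """Squared feature-space distance ||phi(x) - phi(v)||^2.

    In general this equals K(x, x) - 2 K(x, v) + K(v, v). For the Gaussian
    kernel K(x, x) = 1, so it reduces to 2 * (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


def rough_assign(x, centers, sigma=1.0, ratio=1.2):
    """Return (lower, upper) for point x.

    Assumed rule (a ratio-threshold variant, not taken from the paper):
    every cluster whose kernel distance is within `ratio` times the nearest
    distance is a candidate. One candidate: x belongs to that cluster's lower
    approximation. Several candidates: x lies only in their upper
    approximations (the boundary region), so `lower` is None.
    """
    d = [kernel_sq_distance(x, v, sigma) for v in centers]
    nearest = min(range(len(centers)), key=d.__getitem__)
    candidates = [j for j, dj in enumerate(d) if dj <= ratio * d[nearest]]
    if len(candidates) == 1:
        return nearest, [nearest]
    return None, candidates


centers = [(0.0, 0.0), (4.0, 0.0)]
clear_case = rough_assign((0.1, 0.0), centers)     # close to the first center
boundary_case = rough_assign((2.0, 0.0), centers)  # equidistant from both
```

In this sketch a point near one center is placed in that cluster's lower approximation, while a point equidistant from two centers is kept in the boundary region of both, which is how rough clustering represents uncertain membership.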