Abstract
Aim: To develop a machine-learning prediction model based on the Support Vector Machine (SVM).
Background: Workload prediction is one of the primary tasks in provisioning resources in a cloud environment. How well future workload requirements can be forecast depends on the competency of the prediction technique, and an accurate technique can maximize resource usage in a cloud computing environment.
Objective: To reduce the training time of the SVM model.
Methods: First, K-Means clustering is applied to the training dataset to form ‘n’ clusters. Then, for every tuple in a cluster, the tuple’s class label is compared with its cluster label. If the two labels are identical, the tuple is considered rightly classified; such a tuple would contribute little to SVM training, which formulates the separating hyperplane with the lowest generalization error, so it is left out. Otherwise, the tuple is added to the reduced training dataset. This selective addition of tuples is carried out for all clusters, and the SVM is trained on the reduced dataset. The support vectors, which determine the optimal separating hyperplane, are a few of the samples in the reduced training dataset.
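The selection step described in Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how a cluster label is made comparable with a class label, so the sketch assumes each cluster takes the majority class of its members. The function names, the centroid seeding (the first k points), and the toy data are all hypothetical.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm. Seeding with the first k points is an assumption."""
    centroids = [tuple(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def reduce_training_set(points, labels, k):
    """Return indices of tuples whose class label differs from their cluster label."""
    assign = kmeans(points, k)
    # Assumption: a cluster's label is the majority class among its members.
    cluster_label = {}
    for c in set(assign):
        votes = Counter(l for l, a in zip(labels, assign) if a == c)
        cluster_label[c] = votes.most_common(1)[0][0]
    # Keep only the tuples where the two labels disagree.
    return [i for i, (l, a) in enumerate(zip(labels, assign)) if l != cluster_label[a]]


# Hypothetical toy data: two well-separated groups, one off-class tuple in each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 1, 1, 1, 1, 0]
reduced = reduce_training_set(points, labels, k=2)
print(reduced)
```

The indices in `reduced` select the tuples that would be passed to an SVM trainer of one's choice. The remaining tuples, which agree with their cluster label, are left out of training.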
Results: On the Google Cluster Trace dataset, the proposed model achieved a lower training time and Root Mean Square Error, and a marginally higher R2 score, than the traditional SVM. The model has also been tested on Los Alamos National Laboratory’s Mustang and Trinity cluster traces.
Conclusion: CPU utilization in CloudSim (VM and Cloudlet utilization) was measured and found to increase when the same set of tasks was run through the proposed model.
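For reference, the two accuracy metrics named in Results can be computed as in the sketch below. It uses the standard textbook definitions of Root Mean Square Error and the R2 score. The abstract does not state how the authors computed them, and the sample values are hypothetical.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared residual."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical workload values; the prediction is always the mean of `actual`.
actual = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 2.0]
print(rmse(actual, predicted), r2_score(actual, predicted))
```

A lower RMSE and an R2 score closer to 1 both indicate predictions that track the observed workload more closely.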
Keywords: Cloud computing, machine learning, K-means clustering, support vector machine (SVM), CPU, tuple's cluster label.