Abstract
Background: Event logs generated during software development record the history of development activities and reveal trends in them. Mining and analyzing these logs helps identify and discover software development activities and provides effective support for software development process mining and modeling.
Methods: First, a deep learning model (Word2vec) was used to extract features from software development process event logs and convert them to vectors. Then, the K-means algorithm was used to cluster the vectorized event logs, and the silhouette coefficient and the intra-cluster sum of squared errors (SSE) were used to evaluate the clustering quality.
Results: A mapping between software development activities and events was obtained, enabling the identification and discovery of software development activities.
Conclusion: Case studies on two real-world software development projects (jEdit and ArgoUML) demonstrate the feasibility, rationality and effectiveness of the proposed method. This work provides effective support for software development process mining and for guiding software development behavior.
Keywords: Event log, process mining, software development activity, clustering analysis, event, environment.
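The clustering and evaluation step described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation. The event names and two-dimensional vectors below are hypothetical stand-ins for Word2vec embeddings of event log entries, and the farthest-point initialization is an assumption chosen only to make the run reproducible. The sketch runs K-means for two candidate values of k and reports the intra-cluster SSE and the mean silhouette coefficient for each.

```python
import math

# Hypothetical event vectors; in the paper these would come from Word2vec.
event_vectors = {
    "commit bug fix": (0.0, 0.0),
    "close bug report": (0.0, 1.0),
    "patch defect": (1.0, 0.0),
    "add unit test": (10.0, 10.0),
    "run test suite": (10.0, 11.0),
    "update test case": (11.0, 10.0),
}

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = mean(members)
    return labels, centroids

def sse(points, labels, centroids):
    """Intra-cluster sum of squared distances to the assigned centroid."""
    return sum(dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # Mean distance to the other members of the same cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

points = list(event_vectors.values())
results = {}
for k in (2, 3):
    labels, centroids = kmeans(points, k)
    results[k] = (labels, sse(points, labels, centroids),
                  silhouette(points, labels))
    print(f"k={k}: SSE={results[k][1]:.3f}, silhouette={results[k][2]:.3f}")

# Choose the k with the highest mean silhouette coefficient.
best_k = max(results, key=lambda k: results[k][2])
for name, label in zip(event_vectors, results[best_k][0]):
    print(f"cluster {label}: {name}")
```

SSE always decreases as k grows, so it is usually read together with the silhouette coefficient rather than minimized on its own. In this toy data the silhouette coefficient favors k = 2, which separates the bug-related events from the test-related ones.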