IMPLEMENTATION OF DATA MINING IN SOFTWARE ENGINEERING
Keywords:
Software Engineering, Data Mining, Hidden Patterns, Software Design
Abstract
Software companies generate enormous amounts of data during development. Every phase, from requirements through to maintenance, produces a new collection of data. This data is gathered and kept in software repositories in order to improve software quality. The repositories hold a vast amount of data, which is mined with different data mining methods to identify new patterns or highlights. Research in this field has recently become a popular multidisciplinary subject for Software Engineering and Data Mining researchers. This study examines the numerous applications of data mining in software engineering, the various forms of software engineering data that may be mined, and the data mining techniques that researchers have used to tackle their respective problems. This categorization makes it possible to narrow down which software engineering topic currently attracts the most academic interest.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Re-users must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license allows for redistribution, commercial and non-commercial, as long as the original work is properly credited.