An Analysis of Enhancement in K-Means Clustering

Authors

  • Preeti Saini, Research Scholar, Department of CSE, JIET Jind
  • Sapna Aggarwal, Assistant Professor, Department of CSE, JIET Jind

Keywords:

Algorithm, Regression, Clustering, Data Mining

Abstract

In the modern era, users routinely need to retrieve large amounts of data from vast collections. The process of extracting useful data in an understandable form is called data mining. Big data [6] is a term for data sets that are so large or complex that traditional data processing applications are insufficient. Accuracy in big data may lead to more confident decision making, and better decisions can result in greater operational efficiency, cost reduction, and reduced risk. Various algorithms [5] and techniques, such as Classification, Regression, Artificial Intelligence, Neural Networks, Association Rules, Decision Trees, and the Nearest Neighbour approach, are used for knowledge discovery from databases. Clustering is an important data analysis technique that plays a significant role in data mining applications. It is the technique of arranging a set of similar objects into a group. Partition-based clustering is an important clustering technique: it is a centroid-based approach in which the data points are split into k partitions, and each partition represents a cluster. A widely used partition-based clustering algorithm is the k-means algorithm. However, this method suffers from the empty cluster problem, in which a cluster ends up with no data points assigned to it. This problem can be reduced by using an enhanced algorithm. In this paper, we analyse the standard k-means algorithm and an enhanced k-means algorithm.
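The empty cluster problem mentioned above can be illustrated with a minimal Python sketch. The abstract does not specify how the enhanced algorithm works, so the repair step shown here (moving the point farthest from its own centroid into the empty cluster) is only an assumed, commonly used remedy and not necessarily the authors' method. The function names `sq_dist`, `mean`, and `kmeans` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """k-means with an assumed repair step for empty clusters.

    Requires len(points) >= k. Returns (centroids, labels).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Repair step (assumption, not taken from the paper): if cluster j
        # is empty, move into it the point farthest from its own centroid,
        # choosing only from clusters that keep at least one member.
        for j in range(k):
            if j not in labels:
                counts = [labels.count(c) for c in range(k)]
                candidates = [i for i in range(len(points))
                              if counts[labels[i]] > 1]
                far = max(candidates,
                          key=lambda i: sq_dist(points[i], centroids[labels[i]]))
                labels[far] = j
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = [mean([p for p, l in zip(points, labels) if l == j])
                         for j in range(k)]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    # Duplicate points make it likely that two initial centroids coincide,
    # which is one way plain k-means produces an empty cluster.
    data = [(0.0, 0.0), (0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 9.0)]
    centroids, labels = kmeans(data, k=3)
    print(centroids)
    print(labels)
```

Without the repair step, a cluster whose initial centroid coincides with another centroid can receive no points, and its mean is then undefined. The sketch keeps every cluster populated at the cost of one extra pass over the clusters per iteration.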

References

Piatetsky-Shapiro, Gregory (1991), Discovery, analysis, & presentation of strong rules, in Piatetsky-Shapiro, Gregory; & Frawley, William J.; eds., Knowledge Discovery in Databases, AAAI/MIT Press, Cambridge, MA.

Agrawal, R.; Imieliński, T.; Swami, A. (1993). "Mining association rules between sets of items in large databases". Proceedings of 1993 ACM SIGMOD international conference on Management of data - SIGMOD '93. p. 207. doi:10.1145/170035.170072. ISBN 0897915925.

Hahsler, Michael (2005). "Introduction to arules – A computational environment for mining association rules & frequent item sets" (PDF). Journal of Statistical Software.

Michael Hahsler (2015). A Probabilistic Comparison of Commonly Used Interest Measures for Association Rules. http://michael.hahsler.net/research/association_rules/measures.html

Hipp, J.; Güntzer, U.; Nakhaeizadeh, G. (2000). "Algorithms for association rule mining --- a general survey & comparison". ACM SIGKDD Explorations Newsletter 2: 58. doi:10.1145/360402.360421.

Tan, Pang-Ning; Michael, Steinbach; Kumar, Vipin (2005). "Chapter 6. Association Analysis: Basic Concepts & Algorithms" (PDF). Introduction to Data Mining. Addison-Wesley. ISBN 0-321-32136-7.

Pei, Jian; Han, Jiawei; & Lakshmanan, Laks V. S.; Mining frequent itemsets with convertible constraints, in Proceedings of 17th International Conference on Data Engineering, April 2–6, 2001, Heidelberg, Germany, 2001, pages 433-442

Agrawal, Rakesh; & Srikant, Ramakrishnan; Fast algorithms for mining association rules in large databases, in Bocca, Jorge B.; Jarke, Matthias; & Zaniolo, Carlo; editors, Proceedings of 20th International Conference on Very Large Data Bases (VLDB), Santiago, Chile, September 1994, pages 487-499

Zaki, M. J. (2000). "Scalable algorithms for association mining". IEEE Transactions on Knowledge & Data Engineering 12 (3): 372–390. doi:10.1109/69.846291.

Published

31-12-2016

How to Cite

Preeti Saini, & Sapna Aggarwal. (2016). An Analysis of Enhancement in K-Means Clustering. International Journal for Research Publication and Seminar, 7(9). Retrieved from https://jrps.shodhsagar.com/index.php/j/article/view/987

Section

Original Research Article