A Survey of Document Ranking and Similarity Using Combination of Various Matching Function
Keywords:
Combined Matching Function, Similarity Measure, Databases, Classification, cosine-Jaccard
Abstract
The volume of information in today's digital world is vast and takes many forms. The major problem with these information sets is their organization. To use the information effectively and efficiently, we categorize, or classify, it by subject area. Without categorization, grabbing the relevant information is not an easy task. Various methods are applied to make this easier, allowing the user to retrieve specific information or documents quickly and to place them into the respective database. The main objective of this paper is to use combinations of the cosine-Jaccard, Jaccard-Dice, and cosine-Dice matching functions to find the similarity between documents, rank the documents according to that similarity, and store them in the appropriate classification in their respective database.
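As a rough illustration of what a combined matching function might look like, the Python sketch below scores documents against a query with a weighted average of two of the three measures named above. The abstract does not say how the measures are combined, so the equal-weight average, the whitespace tokenizer, and all function names (tokenize, combined, rank) are assumptions made for illustration only, not the authors' method.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity over raw term-frequency vectors.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    # Jaccard coefficient over the sets of distinct terms.
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice(a, b):
    # Dice coefficient over the sets of distinct terms.
    sa, sb = set(a), set(b)
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


def combined(a, b, f, g, weight=0.5):
    # Hypothetical combination rule: weighted average of two measures.
    return weight * f(a, b) + (1 - weight) * g(a, b)


def rank(query, docs, f, g, weight=0.5):
    # Score every document against the query and sort by descending score.
    q = tokenize(query)
    scored = [(combined(q, tokenize(d), f, g, weight), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = ["text clustering survey", "image processing", "text mining"]
    for score, doc in rank("text clustering", docs, cosine, jaccard):
        print(round(score, 3), doc)
```

Passing (cosine, jaccard), (jaccard, dice), or (cosine, dice) as the pair f, g gives the three combinations mentioned in the abstract. A different combination rule, such as a product or a learned weighting, would only require changing combined.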
License
Copyright (c) 2018 International Journal for Research Publication and Seminar
This work is licensed under a Creative Commons Attribution 4.0 International License.
Re-users must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license allows for redistribution, commercial and non-commercial, as long as the original work is properly credited.