Web Document Clustering Approach Base on Improvise Fuzzy Clustering using Cosine Similarity and Name Entity Recognition Method
Kalyani Ramesh Pole, Vishakha R. Mote
collaborative filtering and information filtering; Web content; linguistic topological space; Named Entity Recognition; Natural Language Processing
Recent advances in computing technology have produced a rapidly growing body of documents, creating a need to classify document collections by type. Grouping related documents together makes decision-making easier. Researchers conducting interdisciplinary research accumulate repositories on different topics, and classifying these repositories by theme is necessary for analyzing the research work. The experiments are run on different real and artificial data sets, such as NEWS 20, Reuters, emails, and research documents on different topics. The term frequency-inverse document frequency (TF-IDF) algorithm is used together with the fuzzy hierarchical algorithm and K-means. The experiments are first carried out on small data sets, followed by cluster analysis; the best-performing algorithm is then applied to the extended data set. Along with the resulting groups of related documents, the resulting coefficient and the trend of the F-measure are presented to show the behavior of the algorithm on each data set. Our model combines two components: a mixture component used to discover latent groups in the document collection, and a topic model component used to mine multi-grain topics, including cluster-specific local topics and global topics shared between clusters. We use variational inference to approximate the posterior of the hidden variables and learn the model parameters. Experiments on two data sets demonstrate the effectiveness of our model.
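To make the pipeline named in the title and abstract more concrete, the following is a minimal sketch of TF-IDF weighting, cosine similarity, and a standard fuzzy c-means loop that uses cosine distance. It is not the authors' implementation: the function names, the smoothed IDF formula, the fuzzifier `m=2.0`, the choice of the first documents as initial centers, and the four toy documents are all assumptions made for illustration, and the named entity recognition step is omitted.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumed smoothed IDF variant: ln((1 + N) / (1 + df)) + 1.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuzzy_memberships(vectors, centers, m=2.0, eps=1e-9):
    """Fuzzy c-means membership of every document in every cluster."""
    memberships = []
    for vec in vectors:
        # Cosine distance, floored at eps to avoid division by zero.
        dists = [max(1.0 - cosine_similarity(vec, c), eps) for c in centers]
        inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(inv)
        memberships.append([v / total for v in inv])
    return memberships


def update_centers(vectors, memberships, n_clusters, m=2.0):
    """Recompute each center as the membership-weighted mean of the documents."""
    centers = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        center = Counter()
        for vec, w in zip(vectors, weights):
            for term, x in vec.items():
                center[term] += w * x / total
        centers.append(dict(center))
    return centers


def fuzzy_cmeans(vectors, n_clusters, m=2.0, n_iter=20):
    """Run a fixed number of fuzzy c-means iterations with cosine distance."""
    # Assumption: deterministic start from the first n_clusters documents.
    centers = [dict(v) for v in vectors[:n_clusters]]
    for _ in range(n_iter):
        memberships = fuzzy_memberships(vectors, centers, m)
        centers = update_centers(vectors, memberships, n_clusters, m)
    return fuzzy_memberships(vectors, centers, m)


# Hypothetical toy corpus: two finance documents and two sports documents.
docs = [
    "stock market shares fall",
    "football team wins match",
    "market shares rise stock",
    "team loses football match",
]
memberships = fuzzy_cmeans(tfidf_vectors(docs), n_clusters=2)
for doc, row in zip(docs, memberships):
    print(doc, [round(u, 3) for u in row])
```

Each printed row gives a document's degree of membership in the two clusters; unlike K-means, a document is not forced into a single group, which is the property that motivates a fuzzy approach for documents spanning several topics.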
"Web Document Clustering Approach Base on Improvise Fuzzy Clustering using Cosine Similarity and Name Entity Recognition Method", IJSDR - International Journal of Scientific Development and Research (www.IJSDR.org), ISSN: 2455-2631, Vol. 2, Issue 12, pp. 93-101, December 2017. Available: https://ijsdr.org/papers/IJSDR1712014.pdf
Volume 2, Issue 12, December 2017
Pages: 93-101
Paper Reg. ID: IJSDR_170875
Published Paper Id: IJSDR1712014
Research Area: Engineering
Country: Aurangabad, Maharashtra, India
ISSN: 2455-2631 | IMPACT FACTOR: 9.15 Calculated By Google Scholar | ESTD YEAR: 2016
Publisher: IJSDR(IJ Publication) Janvi Wave