Author: Jeyavathana, R.Beaulah
European Journal of Molecular & Clinical Medicine,
2020, Volume 7, Issue 4, Pages 1780-1784
Traditional clustering techniques for unstructured text mostly rely on the vector space model, which treats every text document as a bag of words and ignores word sequence. The order of terms in the document collection plays a major role in cluster quality, yet the vector space model does not capture it. For this reason, recent text clustering is usually done with frequent item based approaches. This paper analyzes three such techniques, frequent word sequence (FWS), frequent item based on maximum document occurrence (FIMDO), and weighted frequent utility pattern agglomerative clustering (WFUPAC), and evaluates them on the newsgroup and Reuters datasets at varying sizes. The results show that WFUPAC outperforms both FWS and FIMDO, and thus improves the accuracy of text clustering in a big data environment.
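The limitation of the bag-of-words view can be illustrated with a minimal Python sketch. It is not the FWS, FIMDO, or WFUPAC method evaluated in the paper. The two sample documents and the helper names `bag_of_words` and `word_bigrams` are assumptions made for illustration, and adjacent word pairs serve only as a simplified stand-in for word sequences.

```python
from collections import Counter


def bag_of_words(text):
    """Return word counts for a document, discarding word order."""
    return Counter(text.lower().split())


def word_bigrams(text):
    """Return the set of adjacent word pairs, which preserves local order."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


# Hypothetical sample documents: same words, different order.
doc_a = "dog bites man"
doc_b = "man bites dog"

# The bag-of-words representations are identical.
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# The ordered word pairs have nothing in common.
shared_sequences = word_bigrams(doc_a) & word_bigrams(doc_b)

print(same_bag)
print(shared_sequences)
```

Under a bag-of-words model the two documents are indistinguishable, while a representation built on word sequences separates them, which is the motivation given above for sequence-aware and frequent item based clustering.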