Online ISSN: 2515-8260

Keywords: gene; protein entities

A Probabilistic Key phrase extraction approach on large biomedical documents

Jose Mary Golamari; D. Haritha

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 3, Pages 4309-4322

As biomedical databases grow day by day, finding an essential feature set for classification becomes difficult because of large data size and sparsity. Text feature ranking and clustering remain major challenges for scientific and medical researchers because the feature space is high-dimensional and the number of samples is limited. High dimensionality is a particular issue in biomedical document clustering because of the large number of candidate sets. Selecting high-probability features is therefore essential for biomedical document analysis tasks such as classification and clustering. In this paper, a novel probabilistic key-phrase extraction and preprocessing model is designed and implemented on a large number of biomedical documents. In this framework, a novel key-phrase extraction method is used to filter the large biomedical document sets. Experimental results show that the proposed key-phrase extraction approach outperforms existing key-phrase extraction approaches in terms of runtime and accuracy.