Online ISSN: 2515-8260

Keywords : Clustering


Clustering Analysis from Universities in Indonesia based on Sentiment Analysis

Hendra Achmadi; Isana Meranga; Dewi Wuisan; Irwan Suarly; I Gusti Anom Yudistira; Rudy Pramono

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 10, Pages 1466-1481

Two sources can be used to judge the quality of a university in Indonesia. The first is the university clustering list published by the Ministry of Research, Technology and Higher Education; the second is social media such as Twitter. In this research we use text mining and data mining methods to build a sentiment analysis from 50,100 tweets covering 501 universities, using Python and a Python library for natural language processing. The sentiment results are joined with the Ministry's university clustering, producing positive, neutral and negative sentiment for each of the 501 universities in 2020. The analysis then continues in RStudio using K-Means, in two steps. In step 1, the data set of 501 universities is processed into 5 clusters, and the similarity between the Netizen clusters and the Ministry's clusters is 37%. In step 2, after cleansing the zero values, 169 universities remain, and the similarity between the Netizen clusters and the Ministry's clusters is reported as 37%, the same before and after data cleansing. Two novel findings can be derived from the Netizen data. First, clusters can be derived from positive sentiment. Second, the Netizen clusters match the clusters from the Directorate General of Higher Education, Ministry of Education and Culture, for only around 37% of universities. After data cleansing, the match for the 169 universities was only around 33%.
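The comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes one-dimensional K-Means on positive-tweet counts, ranks the resulting clusters by centroid so that cluster 1 is the most positive, and then counts how often the rank equals the Ministry cluster. The abstract does not say how the match was computed, and the original clustering was run in RStudio, so the ranking step, the function names and the six data points are all hypothetical.

```python
def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's k-means on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, l in zip(values, new_labels) if l == c]
            new_centroids.append(sum(members) / len(members) if members
                                 else centroids[c])
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return centroids, labels


def rank_labels(centroids, labels):
    """Relabel clusters 1..k so that 1 is the cluster with the highest centroid."""
    order = sorted(range(len(centroids)), key=lambda c: -centroids[c])
    rank_of = {c: r + 1 for r, c in enumerate(order)}
    return [rank_of[l] for l in labels]


def match_rate(labels_a, labels_b):
    """Share of items that carry the same cluster number in both labelings."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


# Hypothetical data: positive-tweet counts and Ministry clusters for six universities.
positive_counts = [90, 85, 40, 42, 5, 7]
ministry_cluster = [1, 2, 2, 3, 3, 3]

centroids, labels = kmeans_1d(positive_counts, k=3)
netizen_cluster = rank_labels(centroids, labels)
print(netizen_cluster, round(match_rate(netizen_cluster, ministry_cluster), 2))
```

Ranking clusters by centroid is only one way to align arbitrary K-Means labels with an external ranking; a different alignment rule would give a different match percentage.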

BLENDED KERNEL FUZZY LOCAL INFORMATION C-MEANS (BKFLICM) CLUSTERING BASED EDGE DETECTION FOR LUNG IMAGES

P. Dhanalakshmi; Dr. G. Satyavathy

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 9, Pages 1920-1937

Medical diagnosis and clinical practice greatly demand medical image classification, an emerging area of research that draws on modern medical imaging technology. Recently, the Fuzzy Bat Algorithm (FBA) with a Mean Weight Convolution Neural Network (MWCNN) was proposed for Region of Interest (RoI) detection in lung images in order to increase classification accuracy. The outcomes of an image processing system, e.g. region segmentation and object detection, are influenced by edge detection. Here, edge detection is done through the Blended Kernel Based Fuzzy Local Information C-Means (BKFLICM) technique, and the construction of gradients in the scale is achieved by clustering all image pixels in a feature space. Image segmentation relies mainly on pixel intensity, which is used to assess the resemblance between pixels. Edge detection using BKFLICM is performed by forming a new kernel, obtained by merging the hyperbolic tangent kernel and the Gaussian kernel. The special feature of BKFLICM is its fuzzy local (gray level) similarity measure, computed through the kernel function. This detects edges accurately while preserving image details, after which FBA and the MWCNN classifier are used for segmentation and classification respectively. MWCNN is trained for lung image classification with sufficient labelled images and without severe over-fitting, and improved accuracy is obtained on the LIDC-IDRI database. Experimental outcomes confirm that accuracy, precision, recall and F-measure are also improved by the proposed algorithm.

Discriminating Anthropometric Characteristics of Malaysian Youth Handball Players

Jeffrey F.L. Low; Siti Musliha Mat-Rasid; Ruaibah Yazani Tengah; Normah Jusoh; Muazu Musa; Norlaila Azura Kosni

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 2, Pages 6008-6018

This study aimed to determine the morphological profile of youth handball players based on anthropometric measurements and to identify the most significant variables that differentiate the players. A sample of 156 male and 157 female Malaysian youth handball players was evaluated for anthropometric measurements (body weight, standing height, body mass index, leg length and arm span). The multivariate methods of Hierarchical Agglomerative Cluster Analysis (HACA) and Discriminant Analysis (DA) were used to determine the groups and to study the variation of the most significant anthropometric variables. HACA formed three clusters of morphological characteristics (BS1, BS2 and BS3) for both male and female players. HACA assigned 41, 83 and 32 male players to the BS1, BS2 and BS3 clusters, respectively. Meanwhile, a total of 63, 79 and 13 female players were assigned to the BS1, BS2 and BS3 clusters, respectively. For male players, the classification correctness using standard mode is 97.44% with five significant variables (body weight, standing height, body mass index, leg length and arm span). Forward stepwise DA revealed 96.79% correctness with only two significant variables (body weight and arm span), while backward stepwise DA revealed 97.44% classification correctness with four significant variables out of five, after removing the leg length parameter. For female players, the classification correctness using standard mode is 93.63% with five significant variables (body weight, standing height, body mass index, leg length and arm span). Forward stepwise DA revealed 94.27% correctness with only two significant variables (body weight and arm span), while backward stepwise DA revealed 93.63% classification correctness with four significant variables out of five, again after removing the leg length parameter.
Information on the physical characteristics of players can help coaches assign players to the appropriate position according to their morphological profile category. In the long run, this approach helps to save time and manpower and to make decisions scientifically.
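The grouping step can be sketched in Python as a small agglomerative clustering routine. The abstract names neither the linkage nor the distance used in HACA, so average linkage with Euclidean distance is an assumption here, and the five (body weight, standing height) pairs are invented for illustration. Real measurements would normally be standardised first so that no single variable dominates the distance.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Closeness is average linkage: the mean Euclidean distance over all
    cross-cluster pairs of points. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def average_link(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # combinations() yields i < j, so deleting index j keeps i valid.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: average_link(clusters[pair[0]],
                                                 clusters[pair[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical (body weight in kg, standing height in cm) pairs.
players = [(60, 170), (62, 172), (80, 185), (82, 186), (45, 155)]
groups = agglomerative(players, n_clusters=3)
print(sorted(sorted(g) for g in groups))
```

The discriminant analysis step is not sketched; it would take these cluster labels as the target and test which measurements separate the groups.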

DTW SIMILARITY MEASURE BASED U-SHAPELETS CLUSTERING ALGORITHM FOR TIME-SERIES DATA

Arathi M; Govardhan A

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 2, Pages 3378-3392

Time-series analysis has delivered significant knowledge in numerous domains. Most investigation of time-series analysis is restricted by the requirement for expensive labelled data. This has led to growing interest in clustering time-series data, which does not need access to labelled data. Clustering time-series data raises issues that do not arise in conventional clustering methodologies, which operate on objects in Euclidean space. Therefore, the authors suggest an innovative clustering technique for time series that employs the Dynamic Time Warping (DTW) similarity measure on extracted unsupervised shapelets (u-shapelets). The extracted u-shapelets are clustered using an iterative k-means algorithm. The DTW similarity measure gives better accuracy in the clusters formed by the proposed methodology than the Euclidean distance (ED) measure. The performance of the suggested approach is evaluated using the Rand Index (RI). Experiments were performed on four different time-series data sets, and the outcomes showed that the RI for the DTW-based time-series clustering algorithm is higher than that of the existing ED-based time-series clustering algorithm.
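The two measures named in this abstract can be written out in a short Python sketch. It shows the textbook dynamic-programming form of DTW with absolute difference as the local cost, a plain Euclidean distance for comparison, and the Rand Index over pairs of items. The u-shapelet extraction and the iterative k-means are not reproduced, and the sample series and function names are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances alone
                                    cost[i][j - 1],      # b advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def euclidean_distance(a, b):
    """Point-by-point distance; requires sequences of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


# The same bump, shifted by one time step.
s1 = [0, 1, 2, 1, 0, 0]
s2 = [0, 0, 1, 2, 1, 0]
print(dtw_distance(s1, s2), euclidean_distance(s1, s2))
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

On the shifted pair, DTW warps the time axis and finds a zero-cost alignment, while the point-by-point Euclidean distance does not. This illustrates why a DTW-based measure may group similarly shaped series that ED keeps apart.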

An Enhanced Multipath Relay Node Selection Strategy Using Modified Multipath Routing Protocol In MANET

Yamini Swathi L; P S V Subba Rao; K Samatha

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 5, Pages 1072-1087

Mobile ad hoc networks (MANETs) are designed to provide internet connectivity always and everywhere, regardless of geographical location. Applications of MANETs include environment monitoring, military operations and disaster recovery. The resource-constrained environment of a MANET makes communication difficult. Network nodes are powered by limited batteries, and replacing or recharging these batteries is a major challenge. Nodes are added to the MANET without regard to circumstances, so trustworthy and reliable techniques are needed for communication among nodes. Trustworthiness is defined as one node's opinion of another node, expressed numerically, and trust is computed from previous communication among the current nodes. To address these limitations, a modified multipath routing technique is needed. Because a MANET is an infrastructure-less, peer-to-peer network, efficiency in energy utilization is achieved at the network layer. With the improved modified multipath routing protocol, the routing path is chosen according to the current residual condition of the network nodes. The proposed technique performs better in terms of network stability and network lifetime than existing methods such as MRPC and E-AODV. NS2 software is used to run the simulations that assess the effectiveness of the proposed method.
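A minimal Python sketch can illustrate choosing a route by the residual condition of its nodes. The abstract does not give the selection metric, so the max-min rule used here (prefer the path whose weakest node has the most residual energy) is an assumption and not the protocol proposed in the paper. The node names, energy values and candidate paths are hypothetical.

```python
def select_path(paths, residual_energy):
    """Pick the candidate path whose weakest node has the most energy left.

    paths: list of node-name lists from source to destination.
    residual_energy: dict mapping node name to remaining energy (0..1).
    """
    def bottleneck(path):
        # A path is only as durable as its most depleted node.
        return min(residual_energy[node] for node in path)

    return max(paths, key=bottleneck)


# Hypothetical topology: three candidate routes from S to D.
energy = {"S": 0.9, "A": 0.7, "B": 0.2, "C": 0.5, "E": 0.6, "F": 0.6, "D": 0.8}
candidates = [
    ["S", "A", "B", "D"],  # weakest node B at 0.2
    ["S", "C", "D"],       # weakest node C at 0.5
    ["S", "E", "F", "D"],  # weakest nodes E and F at 0.6
]
best = select_path(candidates, energy)
print(best)
```

A rule of this kind trades hop count for lifetime: the shortest route through C is passed over because it would drain a weaker node. A trust score per node, as described in the abstract, could be folded into the same bottleneck function.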

A Novel Approach For Predicting Drug Response Similarity Using Machine Learning

M Supriya Menon; P Raja Rajeswari

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 8, Pages 796-808

The medical domain has been revolutionized in terms of disease, diagnosis and treatment prediction, yet it is under immense pressure from the high dimensionality of numerous multivariate attributes, which lowers the quality of analysis. Techniques such as clustering and classification have dominated, but they leave small gaps on the way to maximum efficiency. Our machine-learning-based approach aims to fill these gaps by adopting an advanced K-Means to anticipate drug likelihood from the core attributes of patients. The proposed methodology focuses on determining drug response similarity through an enhanced clustering technique applied to sensitive patient attributes. We demonstrated its performance on the UCI Patient dataset, with enhanced results on quality parameters.

Development of Top K-Association Rule Mining for Discovering pattern in Medical Dataset

Aakriti Sharma; Anjana Sangwan; Blessy Thankchan; Sachin Jain; Veenita Singh; Shantanu Saurabh

European Journal of Molecular & Clinical Medicine, 2020, Volume 7, Issue 4, Pages 1413-1421

Association rule mining is the discovery of associations between items in transactions. It is one of the most important data mining tasks, has been integrated into many commercial data mining packages and has a wide variety of applications across domains. Computing the top-ranked rules in a data set is a difficult task: finding patterns in a large data set requires memory, computational power and a high rate of I/O, and is feasible only on a high-performance machine. In this paper, the parameters used in the computation are chosen based on minimum support and minimum confidence values. The paper proposes a new algorithm that generates association rules for the input parameters in order to find patterns in a large data set. The algorithm starts by searching for rules. As soon as a rule is found, it is added to a list of rules ordered by support. The list maintains the top N rules found so far. Once N valid rules have been found, the internal minsup variable is raised to the support of the lowest-support rule in the list. Raising minsup prunes the search space when searching for more rules. Thereafter, every time a valid rule is found, the rule is inserted into the list, rules in the list that no longer satisfy minsup are removed, and minsup is raised to the support of the least interesting rule in the list. Results show that the new method is an efficient technique for mining standard data sets on a system with a minimal configuration.
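The list-and-threshold mechanism described above can be sketched in a few lines of Python. This is a simplified illustration and not the paper's algorithm: it only considers rules with a single item on each side, and it uses a min-heap as the support-ordered list. The function name, the parameter names and the five sample transactions are assumptions.

```python
import heapq
from itertools import combinations


def top_k_rules(transactions, k, min_conf):
    """Return the k single-item rules x -> y with the highest support.

    Each result is (support, confidence, antecedent, consequent).
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    count = {i: sum(i in t for t in transactions) for i in items}
    minsup = 0.0  # internal threshold, raised as better rules are found
    heap = []     # min-heap: heap[0] is the weakest rule currently kept

    for a, b in combinations(items, 2):
        # Prune: a pair is never more frequent than its rarer item.
        if min(count[a], count[b]) / n < minsup:
            continue
        both = sum(a in t and b in t for t in transactions)
        support = both / n
        if both == 0 or support < minsup:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = both / count[x]
            if confidence < min_conf:
                continue
            heapq.heappush(heap, (support, confidence, x, y))
            if len(heap) > k:
                heapq.heappop(heap)   # drop the weakest rule
            if len(heap) == k:
                minsup = heap[0][0]   # raise minsup to the weakest kept support
    return sorted(heap, reverse=True)


# Hypothetical transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
rules = top_k_rules(baskets, k=2, min_conf=0.6)
for support, confidence, x, y in rules:
    print(f"{x} -> {y}  support={support:.2f}  confidence={confidence:.2f}")
```

In this run the butter and milk pair is skipped without being counted, because by then minsup has risen to 0.6 and butter alone appears in only 40% of the transactions. That is the pruning effect of raising the threshold.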