
European Journal of Molecular & Clinical Medicine


Online ISSN: 2515-8260

Volume 8, Issue 2

Improved Sampling Data Workflow Using Smtmk To Increase The Classification Accuracy Of Imbalanced Dataset

    Muhammad Syafiq Alza bin Alias; Norazlin Binti Ibrahim; Zalhan Bin Mohd Zin

European Journal of Molecular & Clinical Medicine, 2021, Volume 8, Issue 2, Pages 91-99


Abstract

One of the main challenges in machine learning classification is handling
imbalanced data, because imbalanced data can bias results towards the majority
class and degrade classification performance. Therefore, this paper introduces an
improved workflow to address this issue. After the combination of the Synthetic
Minority Oversampling Technique (SMOTE) and Tomek Links, known as the SMTmk
method, is performed, an additional step is applied to further increase the
performance of machine learning classification, especially in Specificity. This step
reduces the number of majority-class samples based on a ratio to the minority class.
Three machine learning algorithms are used to test the classification results:
Extreme Gradient Boosting, Random Forest and Logistic Regression. The results
recorded in this research show that a ratio of 7 to 1 performs better than the
established methods, namely SMOTE and the hybrid method of SMOTE and Tomek Links.
Keywords:
    SMOTE, Tomek Links, Imbalance Data, Machine learning
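
The additional step described in the abstract, reducing the majority class according to a ratio to the minority class, might look roughly like the sketch below. The abstract does not say how the majority samples to drop are chosen or which way the ratio is read, so this sketch assumes plain random undersampling and reads `ratio` as "majority rows kept per minority row". The function names `undersample_majority` and `specificity` and the toy data are hypothetical and are not taken from the paper. The SMOTE and Tomek Links stages are not reproduced here; a combined resampler is available in the imbalanced-learn library as `SMOTETomek`.

```python
import random
from collections import Counter


def undersample_majority(X, y, majority_label, ratio, seed=0):
    """Randomly drop majority-class rows until at most `ratio` majority
    rows remain per non-majority row. All other rows are kept.

    Assumption: random selection; the paper's abstract does not state
    how the rows to remove are chosen.
    """
    rng = random.Random(seed)
    majority_idx = [i for i, label in enumerate(y) if label == majority_label]
    other_idx = [i for i, label in enumerate(y) if label != majority_label]
    # Never ask for more majority rows than actually exist.
    target = min(len(majority_idx), int(ratio * len(other_idx)))
    kept = sorted(rng.sample(majority_idx, target) + other_idx)
    return [X[i] for i in kept], [y[i] for i in kept]


def specificity(y_true, y_pred, negative_label):
    """Specificity = TN / (TN + FP), computed over the true negatives."""
    tn = sum(1 for t, p in zip(y_true, y_pred)
             if t == negative_label and p == negative_label)
    fp = sum(1 for t, p in zip(y_true, y_pred)
             if t == negative_label and p != negative_label)
    return tn / (tn + fp) if (tn + fp) else 0.0


# Toy data (hypothetical): 100 majority rows (label 0), 10 minority rows (label 1).
X = [[float(i)] for i in range(110)]
y = [0] * 100 + [1] * 10

X_res, y_res = undersample_majority(X, y, majority_label=0, ratio=7)
counts = Counter(y_res)
print(counts[0], counts[1])
```

In a full workflow this step would run on the training split only, after the SMOTE and Tomek Links stage, and `specificity` would be computed on held-out predictions from each classifier.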

This journal is licensed under a Creative Commons Attribution 4.0 International (CC-BY 4.0) license.
