European Journal of Molecular & Clinical Medicine

Improved Sampling Data Workflow Using Smtmk To Increase The Classification Accuracy Of Imbalanced Dataset

    Authors

    • Muhammad Syafiq Alza bin Alias
    • Norazlin Binti Ibrahim
    • Zalhan Bin Mohd Zin

    Industrial Automation Section, UniKL Malaysia France Institute, Bangi, Malaysia

Document Type: Research Article

Abstract

One of the main challenges in machine learning classification is handling
imbalanced data, because imbalanced data can bias results towards the majority
class and degrade classification performance. Therefore, this paper introduces an
improved workflow to address this issue. After the combination of the Synthetic
Minority Oversampling Technique (SMOTE) and Tomek Links, known as the SMTmk
method, is applied, an additional step is required to further increase the
performance of machine learning classification, especially in terms of Specificity.
This step reduces the number of majority-class samples based on a ratio to the
minority class. Three machine learning algorithms are used to test the
classification results: Extreme Gradient Boosting, Random Forest and Logistic
Regression. The results recorded in this research show that a ratio of 7 to 1
performs better than the established methods, namely SMOTE and the hybrid method
of SMOTE and Tomek Links.
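
The additional step described in the abstract, trimming the majority class to a chosen ratio of the minority class after SMOTE and Tomek Links have been applied, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `reduce_majority` and `specificity`, the parameter `majority_per_minority`, random selection of the rows to drop, and the binary-label setting are all assumptions. The abstract does not state the direction of the 7 to 1 ratio, so the ratio is left as a free parameter. The SMOTE and Tomek Links stage itself is assumed to have been run beforehand (for example with the `SMOTETomek` class from the imbalanced-learn package).

```python
import random
from collections import Counter


def reduce_majority(X, y, majority_label, majority_per_minority, seed=0):
    """Randomly drop majority-class rows so that
    n_majority <= majority_per_minority * n_minority.

    Assumptions (not taken from the paper): binary labels, random
    selection of the rows to keep, and a caller-supplied majority label.
    """
    major_idx = [i for i, label in enumerate(y) if label == majority_label]
    minor_idx = [i for i, label in enumerate(y) if label != majority_label]
    if not major_idx or not minor_idx:
        raise ValueError("both classes must be present")

    # Never keep more majority rows than exist, and always keep at least one.
    target = round(majority_per_minority * len(minor_idx))
    target = min(len(major_idx), max(1, target))

    rng = random.Random(seed)
    kept = sorted(rng.sample(major_idx, target) + minor_idx)
    return [X[i] for i in kept], [y[i] for i in kept]


def specificity(y_true, y_pred, negative=0):
    """True-negative rate: TN / (TN + FP)."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p == negative)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p != negative)
    return tn / (tn + fp) if (tn + fp) else float("nan")


if __name__ == "__main__":
    # Toy data standing in for the output of a SMOTE + Tomek Links pass.
    X_demo = [[i] for i in range(100)]
    y_demo = [0] * 80 + [1] * 20
    X_small, y_small = reduce_majority(
        X_demo, y_demo, majority_label=0, majority_per_minority=2.0
    )
    print(sorted(Counter(y_small).items()))  # [(0, 40), (1, 20)]
```

Passing the majority label explicitly avoids guessing which class to trim when the two classes are nearly balanced after resampling. The `specificity` helper is included because the abstract singles out Specificity as the metric the extra step is meant to improve.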

Keywords

  • SMOTE, Tomek Links
  • Imbalance Data
  • Machine Learning
European Journal of Molecular & Clinical Medicine
Volume 8, Issue 2
January 2021
Page 91-99

APA

Alias, M. S. A. B., Ibrahim, N. B., & Zin, Z. B. M. (2021). Improved Sampling Data Workflow Using Smtmk To Increase The Classification Accuracy Of Imbalanced Dataset. European Journal of Molecular & Clinical Medicine, 8(2), 91-99.
