A class skew-insensitive ACO-based decision tree algorithm for imbalanced data sets
Ant-tree-miner (ATM) has an advantage over conventional decision tree algorithms in terms of feature selection. However, real-world applications commonly involve the imbalanced class problem, in which the classes differ in importance. This condition impedes the entropy-based heuristic of existing A...
Published in: | Indonesian Journal of Electrical Engineering and Computer Science |
---|---|
Main Author: | |
Format: | Article |
Language: | English |
Published: | Institute of Advanced Engineering and Science 2021 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85092611049&doi=10.11591%2fijeecs.v21.i1.pp412-419&partnerID=40&md5=dcbaef833699b93e12d9d5b29673a763 |
id |
2-s2.0-85092611049 |
---|---|
author |
Mohd Razali M.H.B.; Saian R.B.; Wah Y.B.; Ku-Mahamud K.R. |
title |
A class skew-insensitive ACO-based decision tree algorithm for imbalanced data sets |
publishDate |
2021 |
container_title |
Indonesian Journal of Electrical Engineering and Computer Science |
container_volume |
21 |
container_issue |
1 |
doi_str_mv |
10.11591/ijeecs.v21.i1.pp412-419 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85092611049&doi=10.11591%2fijeecs.v21.i1.pp412-419&partnerID=40&md5=dcbaef833699b93e12d9d5b29673a763 |
description |
Ant-tree-miner (ATM) has an advantage over conventional decision tree algorithms in terms of feature selection. However, real-world applications commonly involve the imbalanced class problem, in which the classes differ in importance. This condition impedes the entropy-based heuristic of the existing ATM algorithm from developing effective decision boundaries because of its bias towards the dominant class. Consequently, the induced decision trees are dominated by the majority class and lack predictive ability on the rare class. This study proposes an enhanced algorithm called Hellinger-Ant-tree-miner (HATM), inspired by the ant colony optimization (ACO) metaheuristic, for imbalanced learning with decision tree classification. The proposed algorithm was compared with the existing ATM algorithm on nine (9) publicly available imbalanced data sets. A simulation study reveals the superiority of HATM as the sample size increases with a skewed class distribution (Imbalanced Ratio < 50%). Experimental results demonstrate that the performance of the existing algorithm, measured by balanced accuracy (BACC), is improved by the class skew-insensitivity of the Hellinger distance. A statistical significance test shows that HATM has a higher mean BACC score than ATM. © 2021 Institute of Advanced Engineering and Science. All rights reserved. |
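The abstract contrasts an entropy-based heuristic with the Hellinger distance but does not give formulas. The sketch below is an illustration only, not the HATM implementation. It assumes the Hellinger distance is computed between the class-conditional distributions of a nominal feature in a two-class problem, which is how it is commonly used as a decision tree splitting criterion. It also computes BACC as the mean of the true positive rate and true negative rate. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def hellinger_split(values, labels, positive):
    """Hellinger distance between the feature-value distributions of the
    positive class and of all other classes (assumed two-class setting)."""
    pos = Counter(v for v, y in zip(values, labels) if y == positive)
    neg = Counter(v for v, y in zip(values, labels) if y != positive)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    if n_pos == 0 or n_neg == 0:
        return 0.0
    total = 0.0
    for v in set(pos) | set(neg):
        # Only within-class fractions appear, so class priors cancel out.
        total += (math.sqrt(pos[v] / n_pos) - math.sqrt(neg[v] / n_neg)) ** 2
    return math.sqrt(total)


def entropy(labels):
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction from splitting on a nominal feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def balanced_accuracy(y_true, y_pred, positive):
    """BACC for two classes: mean of sensitivity and specificity."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    n_pos = sum(t == positive for t in y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


# Toy data (hypothetical): the same feature/class relationship,
# first balanced, then with the negative class replicated ten times.
pos_vals = ["a"] * 8 + ["b"] * 2
neg_vals = ["a"] * 2 + ["b"] * 8

bal_x, bal_y = pos_vals + neg_vals, [1] * 10 + [0] * 10
skew_x, skew_y = pos_vals + neg_vals * 10, [1] * 10 + [0] * 100

hd_bal = hellinger_split(bal_x, bal_y, positive=1)
hd_skew = hellinger_split(skew_x, skew_y, positive=1)
ig_bal = information_gain(bal_x, bal_y)
ig_skew = information_gain(skew_x, skew_y)

print(f"Hellinger: balanced={hd_bal:.4f}, skewed={hd_skew:.4f}")
print(f"Info gain: balanced={ig_bal:.4f}, skewed={ig_skew:.4f}")
```

In this toy setup the Hellinger value is the same for the balanced and the skewed sample, because it depends only on within-class proportions, while the information gain shrinks as the majority class grows. That is the property the abstract calls class skew-insensitivity.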
publisher |
Institute of Advanced Engineering and Science |
issn |
25024752 |
language |
English |
format |
Article |
accesstype |
All Open Access; Gold Open Access; Green Open Access |
record_format |
scopus |
collection |
Scopus |
_version_ |
1809677598277500928 |