Machine Learning Techniques for Distinguishing Android Malware Variants

Portable devices are rapidly and dramatically reshaping usage trends and consumer preferences for electronic devices. Android, the most common mobile operating system, has a privilege-separated protection system with a complex access control mechanism. Android apps require...


Bibliographic Details
Published in: Journal of Applied Data Sciences
Main Authors: Irwansyah I.; Kurniawan T.B.; Dewi D.A.; Zakaria M.Z.; Azmi N.B.
Format: Article
Language: English
Published: Bright Publisher 2025
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85216792805&doi=10.47738%2fjads.v6i1.493&partnerID=40&md5=6f45c832f251948849b4c6974ef35868
ID: 2-s2.0-85216792805
Volume: 6
Issue: 1
DOI: 10.47738/jads.v6i1.493
Portable devices are rapidly and dramatically reshaping usage trends and consumer preferences for electronic devices. Android, the most common mobile operating system, has a privilege-separated protection system with a complex access control mechanism. Android apps require permission to access confidential personal data and device resources. However, studies have shown that various malicious applications can acquire permissions and target systems and applications by misleading users. In this study, we propose a machine-learning approach to classifying Android malware variants by mining the requested permissions, real permissions, suspicious calls, and API calls obtained from and used in Android malware applications. Features were chosen with a feature selection method called KBest; feature selection reduces the number of features and improves performance. Two types of Naïve Bayes classifier, Multinomial and multivariate Bernoulli, are used and compared for malware family classification, framed as a text classification task. Both Naïve Bayes types are evaluated using a confusion matrix on 4022 Android malware applications belonging to 10 families. Experimental findings show that the Multinomial model performs reliably across three test experiments, with an average accuracy of 95%. © 2024, Bright Publisher. All rights reserved.
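As a rough illustration of the two Naïve Bayes variants named in the abstract, the sketch below fits a Multinomial and a Bernoulli model on a tiny invented dataset and summarizes each with a confusion matrix. Everything in it is an assumption made for illustration: the four feature columns, the family labels FamA and FamB, the counts, and the function names are hypothetical, and the KBest feature selection step is omitted. It is not the authors' implementation and does not reproduce their results.

```python
import math

def train_nb(X, y, model="multinomial", alpha=1.0):
    """Fit a naive Bayes model on rows of non-negative feature counts."""
    classes = sorted(set(y))
    n_features = len(X[0])
    params = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior = math.log(len(rows) / len(X))
        if model == "multinomial":
            # Feature probability is proportional to its total count in the class.
            totals = [sum(r[j] for r in rows) for j in range(n_features)]
            denom = sum(totals) + alpha * n_features
            theta = [(t + alpha) / denom for t in totals]
        else:
            # Bernoulli: probability that the feature is present at all.
            present = [sum(1 for r in rows if r[j] > 0) for j in range(n_features)]
            theta = [(p + alpha) / (len(rows) + 2 * alpha) for p in present]
        params[c] = (log_prior, theta)
    return model, params

def predict_nb(fitted, x):
    """Return the class with the highest log-posterior score for sample x."""
    model, params = fitted
    best, best_score = None, -math.inf
    for c, (log_prior, theta) in params.items():
        score = log_prior
        for xj, tj in zip(x, theta):
            if model == "multinomial":
                score += xj * math.log(tj)
            else:
                # Absent features also contribute in the Bernoulli model.
                score += math.log(tj) if xj > 0 else math.log(1.0 - tj)
        if score > best_score:
            best, best_score = c, score
    return best

def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Hypothetical columns: SEND_SMS, READ_CONTACTS, INTERNET, getDeviceId call count.
X_train = [[3, 0, 1, 0], [2, 0, 1, 1], [4, 1, 0, 0],
           [0, 2, 1, 3], [0, 3, 1, 2], [1, 2, 0, 4]]
y_train = ["FamA", "FamA", "FamA", "FamB", "FamB", "FamB"]
X_test = [[3, 0, 1, 0], [0, 2, 0, 3]]
y_test = ["FamA", "FamB"]

classes = sorted(set(y_train))
results = {}
for name in ("multinomial", "bernoulli"):
    fitted = train_nb(X_train, y_train, model=name)
    preds = [predict_nb(fitted, x) for x in X_test]
    results[name] = confusion_matrix(y_test, preds, classes)
    print(name, results[name])
```

The Multinomial model uses how often each feature occurs, while the Bernoulli model uses only whether it occurs and also scores absent features. That difference is what a comparison between the two variants measures.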
ISSN: 2723-6471
Access: All Open Access; Gold Open Access