Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients
Gender distinction and awareness are widely practiced in the social, educational, workplace, and economic sectors across the globe. A person manifests this attribute naturally in gait, body gestures, facial features, and speech. For that reason, automatic gender recognition (AGR) has become an interesting...
Published in: | Indonesian Journal of Electrical Engineering and Computer Science |
---|---|
Main Author: | Ali Y.M. |
Format: | Article |
Language: | English |
Published: | Institute of Advanced Engineering and Science, 2022 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85139389165&doi=10.11591%2fijeecs.v28.i2.pp753-761&partnerID=40&md5=4ab9d7ba2e20d66755d8c80ffaf711b7 |
id | 2-s2.0-85139389165 |
---|---|
spelling | Ali Y.M.; Noorsal E.; Mokhtar N.F.; Saad S.Z.M.; Abdullah M.H.; Chin L.C. Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients 2022 Indonesian Journal of Electrical Engineering and Computer Science 28 2 10.11591/ijeecs.v28.i2.pp753-761 https://www.scopus.com/inward/record.uri?eid=2-s2.0-85139389165&doi=10.11591%2fijeecs.v28.i2.pp753-761&partnerID=40&md5=4ab9d7ba2e20d66755d8c80ffaf711b7 Gender distinction and awareness are widely practiced in the social, educational, workplace, and economic sectors across the globe. A person manifests this attribute naturally in gait, body gestures, facial features, and speech. For that reason, automatic gender recognition (AGR) has become an interesting sub-topic of speech recognition and can be found in many speech technology applications. However, retrieving salient gender-related information from a speech signal is a challenging problem, since speech carries abundant information beyond gender. This paper compares the performance of a human vocal tract-based model, i.e., linear prediction coefficients (LPC), and a human auditory-based model, i.e., Mel-frequency cepstral coefficients (MFCC), both widely used in other speech recognition tasks, by experimenting with the feature parameters and classifier parameters to find optimal settings. The audio data used in this study were obtained from 93 speakers uttering selected words containing different vowels. The two feature vectors were tested with two classification algorithms, namely discriminant analysis (DA) and an artificial neural network (ANN). Although the experimental results were promising for both feature sets, the best overall accuracy of 97.07% was recorded with the MFCC-ANN combination, with nearly equal performance for the male and female classes. © 2022 Institute of Advanced Engineering and Science. All rights reserved. Institute of Advanced Engineering and Science 25024752 English Article All Open Access; Gold Open Access |
author | Ali Y.M.; Noorsal E.; Mokhtar N.F.; Saad S.Z.M.; Abdullah M.H.; Chin L.C. |
spellingShingle | Ali Y.M.; Noorsal E.; Mokhtar N.F.; Saad S.Z.M.; Abdullah M.H.; Chin L.C. Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients |
author_facet | Ali Y.M.; Noorsal E.; Mokhtar N.F.; Saad S.Z.M.; Abdullah M.H.; Chin L.C. |
author_sort | Ali Y.M.; Noorsal E.; Mokhtar N.F.; Saad S.Z.M.; Abdullah M.H.; Chin L.C. |
title | Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients |
title_short | Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients |
title_full | Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients |
title_fullStr | Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients |
title_full_unstemmed | Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients |
title_sort | Speech-based gender recognition using linear prediction and mel-frequency cepstral coefficients |
publishDate | 2022 |
container_title | Indonesian Journal of Electrical Engineering and Computer Science |
container_volume | 28 |
container_issue | 2 |
doi_str_mv | 10.11591/ijeecs.v28.i2.pp753-761 |
url | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85139389165&doi=10.11591%2fijeecs.v28.i2.pp753-761&partnerID=40&md5=4ab9d7ba2e20d66755d8c80ffaf711b7 |
description | Gender distinction and awareness are widely practiced in the social, educational, workplace, and economic sectors across the globe. A person manifests this attribute naturally in gait, body gestures, facial features, and speech. For that reason, automatic gender recognition (AGR) has become an interesting sub-topic of speech recognition and can be found in many speech technology applications. However, retrieving salient gender-related information from a speech signal is a challenging problem, since speech carries abundant information beyond gender. This paper compares the performance of a human vocal tract-based model, i.e., linear prediction coefficients (LPC), and a human auditory-based model, i.e., Mel-frequency cepstral coefficients (MFCC), both widely used in other speech recognition tasks, by experimenting with the feature parameters and classifier parameters to find optimal settings. The audio data used in this study were obtained from 93 speakers uttering selected words containing different vowels. The two feature vectors were tested with two classification algorithms, namely discriminant analysis (DA) and an artificial neural network (ANN). Although the experimental results were promising for both feature sets, the best overall accuracy of 97.07% was recorded with the MFCC-ANN combination, with nearly equal performance for the male and female classes. © 2022 Institute of Advanced Engineering and Science. All rights reserved. |
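For readers who want a concrete sense of the pipeline the abstract describes, below is a minimal sketch of MFCC and LPC feature extraction followed by DA and ANN classification. It is an illustration under assumed tooling (librosa and scikit-learn), not the authors' implementation: the paper's actual feature orders, network architecture, data split, and toolchain are not reproduced here, and the helper names, file list, and labels (`mfcc_features`, `lpc_features`, `evaluate`, `wav_paths`, `gender_labels`) are hypothetical.

```python
# Illustrative sketch only; parameters and helpers are placeholders, not the paper's setup.
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score

def mfcc_features(path, n_mfcc=13):
    """Auditory-based features: mean MFCC vector over all frames of one utterance."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape (n_mfcc, n_frames)
    return mfcc.mean(axis=1)                                # one vector per utterance

def lpc_features(path, order=12):
    """Vocal tract-based features: linear prediction coefficients of one utterance."""
    y, _ = librosa.load(path, sr=None)
    return librosa.lpc(y, order=order)[1:]                  # drop the leading 1.0 coefficient

def evaluate(files, labels, extractor):
    """Train DA and ANN classifiers on the chosen features and print test accuracy."""
    X = np.vstack([extractor(f) for f in files])
    y = np.asarray(labels)                                  # e.g. 0 = male, 1 = female
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y)
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
    for name, clf in [("DA", LinearDiscriminantAnalysis()),
                      ("ANN", MLPClassifier(hidden_layer_sizes=(32,),
                                            max_iter=1000, random_state=0))]:
        clf.fit(X_tr, y_tr)
        print(name, accuracy_score(y_te, clf.predict(X_te)))

# evaluate(wav_paths, gender_labels, mfcc_features)  # hypothetical file list and labels
```

Swapping `mfcc_features` for `lpc_features` in the final call reproduces, at toy scale, the LPC-versus-MFCC comparison the abstract reports.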
publisher | Institute of Advanced Engineering and Science |
issn | 25024752 |
language | English |
format | Article |
accesstype | All Open Access; Gold Open Access |
record_format | scopus |
collection | Scopus |
_version_ | 1809678022758891520 |