Leptospirosis modelling using hydrometeorological indices and random forest machine learning



Bibliographic Details
Published in: International Journal of Biometeorology
Authors: Jayaramu V.; Zulkafli Z.; De Stercke S.; Buytaert W.; Rahmat F.; Abdul Rahman R.Z.; Ishak A.J.; Tahir W.; Ab Rahman J.; Mohd Fuzi N.M.H.
Format: Article
Language: English
Published: Springer Science and Business Media Deutschland GmbH 2023
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85147099122&doi=10.1007%2fs00484-022-02422-y&partnerID=40&md5=45d6a53b17ce1fe5ee7b4af6a73117ce

publishDate 2023
container_title International Journal of Biometeorology
container_volume 67
container_issue 3
doi_str_mv 10.1007/s00484-022-02422-y
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85147099122&doi=10.1007%2fs00484-022-02422-y&partnerID=40&md5=45d6a53b17ce1fe5ee7b4af6a73117ce
description Leptospirosis is a zoonosis that has been linked to hydrometeorological variability. Hydrometeorological averages and extremes have been used before as drivers in the statistical prediction of disease. However, their importance and predictive capacity are still little known. In this study, the use of a random forest classifier was explored to analyze the relative importance of hydrometeorological indices in developing the leptospirosis model and to evaluate the performance of models based on the type of indices used, using case data from three districts in Kelantan, Malaysia, that experience annual monsoonal rainfall and flooding. First, hydrometeorological data including rainfall, streamflow, water level, relative humidity, and temperature were transformed into 164 weekly average and extreme indices in accordance with the Expert Team on Climate Change Detection and Indices (ETCCDI). Then, weekly case occurrences were classified into binary classes “high” and “low” based on an average threshold. Seventeen models based on “average,” “extreme,” and “mixed” indices were trained by optimizing the feature subsets based on the model computed mean decrease Gini (MDG) scores. The variable importance was assessed through cross-correlation analysis and the MDG score. The average and extreme models showed similar prediction accuracy ranges (61.5–76.1% and 72.3–77.0%) while the mixed models showed an improvement (71.7–82.6% prediction accuracy). An extreme model was the most sensitive while an average model was the most specific. The time lag associated with the driving indices agreed with the seasonality of the monsoon. The rainfall variable (extreme) was the most important in classifying the leptospirosis occurrence while streamflow was the least important despite showing higher correlations with leptospirosis. © 2023, The Author(s) under exclusive licence to International Society of Biometeorology.
publisher Springer Science and Business Media Deutschland GmbH
issn 0020-7128
language English
format Article
record_format scopus
collection Scopus