Authors:
Yaari R, Galanti M, Zepeda-Tello R, Chicumbe S, Jani I, Cassy A, Macicame I, Manafe N, Farley SM, El-Sadr WM, Shaman J.
Abstract:
Background: Mozambique faces a high burden of infectious diseases but currently has limited capacity for forecasting disease incidence. Recent improvements in disease surveillance through the National Monitoring and Evaluation System now provide weekly reports of disease incidence across the country’s districts. This study focuses on using these records, specifically for malaria and diarrhoeal diseases, which together account for approximately 40% of deaths among children under five, to develop statistical forecasts and evaluate their accuracy.
Methods: We used Darts, a Python library for time series forecasting that includes a variety of statistical forecasting models. Three models were selected for this analysis: Exponential Smoothing (a classical statistical model), Light Gradient Boosting Machine (a machine-learning model), and Neural Hierarchical Interpolation for Time Series (a neural network-based model). Retrospective forecasts were generated and compared across multiple forecast horizons. We evaluated both point and probabilistic forecast accuracy for the individual models and for two types of model ensembles, comparing the results to forecasts based on historical expectance.
Results: All models consistently outperformed forecasts based on historical expectance for both malaria and diarrhoeal disease across forecast horizons of up to eight weeks, with comparable or better performance at 16 weeks. The most accurate forecasts were achieved using a weighted ensemble of the models.
Conclusions: This study highlights the potential of a readily available tool to generate accurate disease forecasts. It represents a step toward scalable and accessible forecasting solutions that can enhance disease surveillance and public health responses, not only in Mozambique but also in other low- and middle-income countries with similar challenges.
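
Illustrative sketch (not the authors' code): the abstract does not specify how the historical expectance baseline or the ensemble weights were computed, so the example below makes two assumptions. The baseline is taken to be the mean of past counts for the same week of the year, and the ensemble weights are taken to be proportional to the inverse of each model's mean absolute error on a validation window. The model names, function names, and all numbers are hypothetical toy values chosen only to show the mechanics in plain Python, without calling Darts.

```python
from statistics import mean


def historical_expectance(history, target_index, period=52):
    """Assumed baseline: mean of past counts observed in the same week of the year.

    `history` is a list of weekly counts starting at week index 0, and
    `target_index` is the index of the week to forecast (beyond the history).
    """
    same_week = history[target_index % period::period]
    return mean(same_week)


def mae(forecast, observed):
    """Mean absolute error of a point forecast."""
    return mean(abs(f - o) for f, o in zip(forecast, observed))


def inverse_error_weights(errors):
    """Assumed weighting scheme: weights proportional to 1 / error, summing to 1."""
    inverse = {name: 1.0 / max(err, 1e-9) for name, err in errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_ensemble(forecasts, weights):
    """Combine per-model point forecasts into one weighted forecast per time step."""
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * forecasts[name][t] for name in forecasts)
        for t in range(horizon)
    ]


# Toy validation window: observed weekly case counts and three model forecasts.
observed = [100, 120, 140, 130]
model_forecasts = {
    "exponential_smoothing": [110, 115, 150, 120],
    "lightgbm": [95, 125, 135, 135],
    "nhits": [105, 130, 120, 140],
}

# Score each model, derive weights from the scores, and build the ensemble.
errors = {name: mae(fc, observed) for name, fc in model_forecasts.items()}
weights = inverse_error_weights(errors)
ensemble = weighted_ensemble(model_forecasts, weights)

# Toy three-year history used to illustrate the baseline for week index 160.
toy_history = [(i % 52) + 10 * (i // 52) for i in range(156)]
baseline = historical_expectance(toy_history, 160)
```

In practice the weights would be estimated on data that precede the forecast period, so that the ensemble is evaluated out of sample; here the same toy window is used for both steps purely for brevity.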