Brief Report: Forecasting Influenza with the Long Short-Term Memory Model: Results from the 2023-2024 Influenza Season

Timely detection of infectious diseases and health threats is of increasing importance, particularly for U.S. military service members. Existing surveillance systems are hindered, however, by a 1- to 2-week delay between actual disease outbreaks and release of surveillance data.1 To address this challenge, since 2019 the Integrated Biosurveillance (IB) Branch of the Armed Forces Health Surveillance Division has conducted forecasting activities during influenza season to provide early warning and increased awareness of potential health risks to the Department of Defense (DOD) enterprise.2 At the end of each influenza season, the IB Branch evaluates the performance of the individual forecasting models and assesses potential integration of new algorithms to improve forecasting capabilities for the next influenza season.

The Long Short-Term Memory (LSTM) model is a machine-learning method with potential to improve forecasting accuracy for respiratory disease surveillance.3 LSTM is a type of recurrent neural network that can be applied across a wide range of modeling fields. It has the capacity to selectively add new information and forget previously accumulated information. While LSTM models are well-established, their performance in forecasting influenza encounters using DOD surveillance data has not been studied. This report assesses the performance of the LSTM model for possible inclusion in future DOD influenza forecasting analyses.
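
The selective adding and forgetting described above is usually implemented with gates. The following is a minimal sketch of one step of a standard LSTM cell, written in plain Python with scalar values for readability. It is an illustration only, not the code used in this report: the function name `lstm_cell_step`, the `params` layout, and the example weights are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (standard formulation).

    `params` maps each gate name to a hypothetical (w_x, w_h, b) triple:
    input weight, recurrent weight, and bias.
    Returns the new hidden state h and the new cell state c.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # share of the old cell state to keep
    i = gate("input", sigmoid)         # share of the new candidate to add
    g = gate("candidate", math.tanh)   # candidate new information
    o = gate("output", sigmoid)        # share of the cell state to expose

    c = f * c_prev + i * g             # forget old information, add new
    h = o * math.tanh(c)
    return h, c


# Illustrative weights only (assumed, not fitted to any data).
example_params = {
    "forget": (0.0, 0.0, 2.0),
    "input": (0.5, 0.0, 0.0),
    "candidate": (1.0, 0.5, 0.0),
    "output": (0.0, 0.0, 1.0),
}
h, c = lstm_cell_step(x=0.3, h_prev=0.1, c_prev=0.4, params=example_params)
print(h, c)
```

A forget gate near 1 carries the previous cell state forward almost unchanged, while an input gate near 1 lets the new candidate value in; trained models learn weights that set these gates differently at each time step.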

Methods

Influenza encounters were defined as outpatient visits with an International Classification of Diseases, 10th Revision discharge diagnosis code in the range J09 through J11. Outpatient influenza encounter data from Military Health System (MHS) beneficiaries were collected weekly during the 2023-2024 influenza season from all U.S. military hospitals and clinics. Total outpatient encounter data were obtained from the DOD’s Electronic Surveillance System for the Early Notification of Community-based Epidemics. The percentage of outpatient influenza encounters was calculated each week as the number of outpatient influenza encounters divided by the total number of outpatient encounters, multiplied by 100.
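
The case definition and weekly percentage can be sketched as follows. This Python example is illustrative rather than the report's actual data pipeline, and it assumes one discharge diagnosis code per encounter; the function names and the sample codes are hypothetical.

```python
INFLUENZA_PREFIXES = {"J09", "J10", "J11"}  # ICD-10 categories J09 through J11


def is_influenza_code(icd10_code: str) -> bool:
    """True if the code's three-character category is J09, J10, or J11."""
    return icd10_code.strip().upper()[:3] in INFLUENZA_PREFIXES


def weekly_influenza_percentage(diagnosis_codes, total_encounters: int) -> float:
    """Influenza encounters as a percentage of all outpatient encounters.

    Assumes `diagnosis_codes` holds one code per outpatient encounter for
    the week and `total_encounters` is the week's total outpatient count.
    """
    if total_encounters <= 0:
        raise ValueError("total_encounters must be positive")
    influenza_count = sum(1 for code in diagnosis_codes if is_influenza_code(code))
    return 100.0 * influenza_count / total_encounters


# Hypothetical week: 3 influenza-coded visits out of 1,000 encounters.
sample_codes = ["J10.1", "J11.1", "A09", "J09.X2"]
print(weekly_influenza_percentage(sample_codes, total_encounters=1000))
```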

Short-term (1- to 2-week) forecasts had previously been generated by the IB Branch each week of the influenza season, from 2023 epidemiological week (EW) 40 through 2024 EW 20, for the U.S., including all military hospitals and clinics. Forecasts were generated weekly using various time series and machine learning models: autoregressive integrated moving average (ARIMA), error-trend-seasonality (ETS), exponentially weighted moving average (EWMA), naïve, neural network, Poisson, Prophet, random forest, time series linear model (TSLM), and vector autoregressive (VAR) model. An ensemble model (ENSEMBLE) was created as the average of all the forecasting models used.
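
An equal-weight ensemble of this kind amounts to a simple mean of the component forecasts for a given week and horizon. The short Python sketch below illustrates the idea; the function name and the forecast values are hypothetical, and it assumes the average is taken over point forecasts.

```python
from statistics import fmean
from typing import Mapping


def ensemble_forecast(model_forecasts: Mapping[str, float]) -> float:
    """Unweighted mean of the component models' point forecasts."""
    if not model_forecasts:
        raise ValueError("at least one model forecast is required")
    return fmean(model_forecasts.values())


# Hypothetical 1-week-ahead forecasts (percentage of outpatient encounters).
forecasts = {"ARIMA": 0.20, "ETS": 0.40, "NAIVE": 0.30}
print(ensemble_forecast(forecasts))
```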

Short-term (1- to 2-week) LSTM model forecasts were generated for percentages of MHS influenza encounters for each week of the 2023-2024 influenza season, using training data from the previous influenza season (2022 EW 40 through 2023 EW 20). Forecast horizons (the time frames for which forecasts are made) were defined as 1 week, 2 weeks, and 1-2 weeks ahead. To validate the model, the data were separated into training and testing sets for each EW of evaluation. Training loss was calculated using mean squared error. Key hyper-parameters, including the number of hidden units (50), the dropout rate (0.2), and an adaptive retrospective period, were used to improve model performance.
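
Training a sequence model on a weekly series generally requires turning the series into input windows and targets. The Python sketch below shows one common way to do this. It is an assumption about the general technique, not a description of the report's preprocessing: the `lookback` argument is a hypothetical stand-in for the retrospective period, and the sample series is invented.

```python
def make_supervised_windows(series, lookback: int, horizon: int):
    """Split a weekly series into (inputs, targets) for sequence training.

    Each input is `lookback` consecutive weekly values; each target is the
    value observed `horizon` weeks after the last value in that window.
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be at least 1")
    inputs, targets = [], []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback                     # index just past the window
        inputs.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])  # value `horizon` weeks ahead
    return inputs, targets


# Invented weekly influenza percentages, for illustration only.
weekly_pct = [0.10, 0.12, 0.15, 0.22, 0.35, 0.50]
x_1wk, y_1wk = make_supervised_windows(weekly_pct, lookback=3, horizon=1)
x_2wk, y_2wk = make_supervised_windows(weekly_pct, lookback=3, horizon=2)
print(len(x_1wk), len(x_2wk))
```

A longer horizon leaves fewer usable windows from the same series, which matters when only one prior season is available for training.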

Weekly forecasts were then compared with observed values from each EW using the weighted interval score (WIS)4 and absolute percentage error (APE). Scores from the LSTM model were then combined with all previously generated model scores to assess model performance.
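
For readers unfamiliar with these scores, the Python sketch below follows the standard definitions: WIS is a weighted sum of the absolute error of the median and the interval scores of one or more central prediction intervals, and APE is the absolute error as a percentage of the observed value. The report does not state which interval levels were scored, so the single 95% interval and all numeric values in the example are assumptions.

```python
def interval_score(lower: float, upper: float, observed: float, alpha: float) -> float:
    """Interval score for a central (1 - alpha) prediction interval.

    Interval width plus a penalty, scaled by 2/alpha, for observations
    that fall below or above the interval.
    """
    width = upper - lower
    penalty_below = (2.0 / alpha) * max(lower - observed, 0.0)
    penalty_above = (2.0 / alpha) * max(observed - upper, 0.0)
    return width + penalty_below + penalty_above


def weighted_interval_score(median: float, intervals, observed: float) -> float:
    """WIS with weights 1/2 for the median and alpha/2 for each interval.

    `intervals` maps alpha to (lower, upper) for the central (1 - alpha)
    prediction interval.
    """
    total = 0.5 * abs(observed - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, observed, alpha)
    return total / (len(intervals) + 0.5)


def absolute_percentage_error(forecast: float, observed: float) -> float:
    """Absolute error expressed as a percentage of the observed value."""
    if observed == 0:
        raise ValueError("APE is undefined when the observed value is zero")
    return 100.0 * abs(forecast - observed) / abs(observed)


# Invented example: forecast median 0.2% with a 95% interval of 0.1% to 0.3%,
# scored against an observed value of 0.5%.
print(weighted_interval_score(0.2, {0.05: (0.1, 0.3)}, observed=0.5))
print(absolute_percentage_error(0.2, observed=0.5))
```

Lower values are better for both scores. Unlike APE, WIS also rewards well-calibrated uncertainty, which is why a model can rank first on one measure and not on the other.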

All analyses and data processing used R version 4.4.2. LSTM models were created using the “torch” package in R, an open-source machine learning framework based on PyTorch.5

Results

WIS, log-transformed WIS, and APE were calculated for 1,924 total forecasts. The average training loss per evaluation week for the LSTM model was 0.5. Median log-transformed WIS and median APE are shown in the Table for each model as well as 1-week, 2-week, and combined 1-2-week forecasts. The LSTM model had the lowest median log-transformed WIS for all forecasting horizons: 1 week (0.3), 2 weeks (0.4), and combined 1-2 weeks (0.4). The VAR model had the lowest median APE for all forecasting horizons (37.5%). Figure 1a presents forecasts with 95% confidence interval bands for the LSTM and ENSEMBLE models over the study period. During 2023 EWs 51 and 52, observed influenza encounter percentages peaked at 0.5% and 0.8%, respectively. The LSTM and ENSEMBLE models under-predicted values, however, with estimates ranging from 0.17% to 0.2% during this period. Figure 1b displays a grouped boxplot of log WIS for each forecast target for all models, ranked by median log WIS. The LSTM model had the lowest log WIS, while the POISSON model had the highest.

FIGURE 1a. Influenza Encounter Percentage by Forecast Target, Military Health System, November 2023–June 2024. This figure is composed of two graphs, each of which charts observed as well as forecasted weekly data: one graph presents 1-week-ahead forecasts and the other presents 2-week-ahead forecasts. Each graph presents a series of data points connected by three different lines along the horizontal, or x-, axis, with two lines in each graph representing a different forecasting model, and the third line plotting observed data for the same time periods. The intervals along the x-axis represent the months from October 2023 through June 2024 in both graphs. In each graph, each line connects 32 data points, each representing a distinct week. The vertical, or y-, axis measures encounter percentages and is divided into units of 0.25, from 0.00 to 0.75. Shaded areas around the lines for the forecasting models represent 95 percent confidence intervals for those forecasts. In each graph, both models lagged behind the greatest spike in the observed data by a week, and both under-estimated it by nearly one third. The confidence interval for the LSTM model was considerably narrower than the confidence interval for the ENSEMBLE model.

FIGURE 1b. Weighted Interval Score by Forecast Target. This figure displays two grouped boxplot charts showing the distribution of log-transformed weighted interval score (log WIS) for 10 different forecasting models, one for 1-week-ahead and the other for 2-week-ahead forecasts, ranked by increasing median log WIS from left to right, indicating decreasing forecast accuracy across the models. In the 1-week-ahead boxplot, LSTM has a shorter box and whiskers than the other models, indicating that the model has higher prediction accuracy and lower uncertainty. ARIMA, on the other hand, has a shorter distance between its median line and minimum than the other models, but its box and whiskers are longer, which means the range of WIS values is wider, indicating lower accuracy and greater uncertainty. For all models except EWMA and PROPHET, data values tend to cluster around a central point. The 2-week-ahead boxplot shows similar results to the 1-week-ahead boxplot: LSTM has a shorter box and ARIMA has a longer box. Except for the VAR and PROPHET models, however, the median lines are positioned close to the top edge of the boxes, indicating that most models have skewed distributions.

Discussion

Our analyses indicate that LSTM had the lowest log WIS among the individual models for all forecasting horizons, indicating more accurate forecasts. These findings align with previous studies that successfully used LSTM models to forecast influenza-like illness and influenza hospitalizations.6,7 Neither the LSTM nor the ENSEMBLE model accurately predicted the peak period, 2023 EWs 51-52 (December 17-30), however. This could be due to the use of only 2022-2023 influenza season data for training, as recent seasonal influenza patterns have exhibited significantly higher peaks earlier in the season compared to influenza seasons prior to the COVID-19 pandemic.8,9 To improve influenza peak period forecasts, training data may need to include multiple years, before and after the COVID-19 pandemic, as part of further analysis.

This study had some limitations. First, this study did not employ a formal cross-validation method to optimize hyper-parameters and construct the best-performing LSTM model, which may have contributed to poor predictions, particularly in the early weeks of the study period. Further research is needed to optimize the LSTM model for influenza encounter predictions. Second, some WIS values were found to be zero, indicating that the estimated value was an exact match to the observed value. Scores equal to zero should be interpreted with caution, as those values may be due to overconfidence and result in an undefined log-transformed WIS.10 Consequently, WIS values equal to 0 were excluded from the calculation of log-transformed WIS, but this may have introduced bias by excluding forecasts that were very close to actual values. Third, it is not possible to state with confidence that these results are generalizable to other respiratory diseases or related metrics such as hospitalizations, admission rates, or case rates. Lastly, this analysis does not reflect changes after the 2023-2024 influenza season to improve forecasting, such as the removal of the ETS, EWMA, PROPHET, and TSLM models. Although the LSTM model outperformed several models included in the ENSEMBLE model, it is likely the ENSEMBLE model will perform better for the 2024-2025 influenza season. 
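
The zero-score exclusion described above can be sketched in Python as follows. The report does not state the logarithm base or whether any offset was applied, so the natural logarithm with no offset is an assumption here, as are the function name and sample values.

```python
import math
from statistics import median


def median_log_wis(wis_values):
    """Median of log-transformed WIS after dropping zero scores.

    Zero scores are excluded because log(0) is undefined. Returns the
    median log WIS and the number of excluded scores so that the size of
    the exclusion can be reported alongside the result.
    Assumes the natural logarithm with no offset.
    """
    positive = [w for w in wis_values if w > 0]
    excluded = len(wis_values) - len(positive)
    if not positive:
        raise ValueError("no positive WIS values to transform")
    return median(math.log(w) for w in positive), excluded


# Invented scores, one of which is exactly zero.
score, dropped = median_log_wis([0.0, 0.8, 1.5, 2.4])
print(score, dropped)
```

Returning the count of dropped scores makes the potential bias visible: the more near-perfect forecasts a model loses to this filter, the more its median log WIS understates its accuracy.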

The findings of this study demonstrate that the addition of the LSTM model improves the short-term forecasting performance of the ENSEMBLE model for outpatient influenza encounter data, which is commonly used to assess the activity intensity of this respiratory disease within the MHS population. Further research is recommended to determine the performance of the LSTM model for other respiratory infections, including COVID-19.

Authors’ Affiliation

Armed Forces Health Surveillance Division, Integrated Biosurveillance Branch, Silver Spring, MD: Ms. Cherukuri, Mr. Bova, Ms. Mehta, Dr. Bautista

References

  1. Jang B, Kim I, Kim JW. Effective training data extraction method to improve influenza outbreak prediction from online news articles: deep learning model study. JMIR Med Inform. 2021;9(5):e23305. doi:10.2196/23305 
  2. Armed Forces Health Surveillance Division. Integrated Biosurveillance. Defense Health Agency, U.S. Dept. of Defense. Accessed Jan. 3, 2025. https://health.mil/military-health-topics/health-readiness/afhsd/integrated-biosurveillance
  3. Dai S, Han L. Influenza surveillance with Baidu index and attention-based long short-term memory model. PLoS One. 2023;18(1):e0280834. doi:10.1371/journal.pone.0280834   
  4. Bracher J, Ray EL, Gneiting T, Reich NG. Evaluating epidemic forecasts in an interval format [published correction in PLoS Comput Biol. 2022;18(10):e1010592. doi:10.1371/journal.pcbi.1010592]. PLoS Comput Biol. 2021;17(2):e1008618. doi:10.1371/journal.pcbi.1008618
  5. Torch for R. Mlverse.org. Accessed Jan. 13, 2025. https://torch.mlverse.org
  6. Tsan YT, Chen DY, Liu PY, et al. The prediction of influenza-like illness and respiratory disease using LSTM and ARIMA. Int J Environ Res Public Health. 2022;19(3):1858. doi:10.3390/ijerph19031858 
  7. Li G, Li Y, Han G, et al. Forecasting and analyzing influenza activity in Hebei province, China, using a CNN-LSTM hybrid model. BMC Public Health. 2024;24(1):2171. doi:10.1186/s12889-024-19590-8 
  8. Del Riccio M, Caini S, Bonaccorsi G, et al. Global analysis of respiratory viral circulation and timing of epidemics in the pre-COVID-19 and COVID-19 pandemic eras, based on data from the Global Influenza Surveillance and Response System (GISRS). Int J Infect Dis. 2024;144:107052. doi:10.1016/j.ijid.2024.107052 
  9. Lewis T. Why this year’s flu season is the worst in more than a decade. Scientific American. Published online Mar. 3, 2025. Accessed Mar. 11, 2025. https://www.scientificamerican.com/article/why-this-years-flu-season-is-the-worst-in-more-than-a-decade
  10. Bosse NI, Abbott S, Cori A, et al. Scoring epidemiological forecasts on transformed scales. PLoS Comput Biol. 2023;19(8):e1011393. doi:10.1371/journal.pcbi.1011393
