Military Health System

Predicting COVID-19 and Respiratory Illness: Results of the 2022–2023 Armed Forces Health Surveillance Division Forecasting Challenge

Since 2019, the Integrated Biosurveillance Branch of the Armed Forces Health Surveillance Division has conducted an annual forecasting challenge during influenza season to predict short-term respiratory disease activity among Military Health System beneficiaries. Weekly observed case and encounter data were used to generate 1- through 4-week ahead forecasts of disease activity. To create unified combinations of model inputs for evaluation across multiple spatial resolutions, eight individual models were used to calculate three ensemble models. The accuracy of each model's forecasts relative to observed activity was evaluated by calculating a weighted interval score. Weekly 1- through 4-week ahead forecasts for each ensemble model were generally higher than observed data, especially during periods of peak activity, with peaks in forecasted activity occurring later than observed peaks. The larger the forecasting horizon, the more pronounced the gap between forecasted and observed peaks. The results showed that several models accurately predicted COVID-19 cases and respiratory encounters with enough lead time for public health response by senior leaders.

What are the new findings?

By testing a large number of traditional (e.g., ARIMA, EWMA) and non-traditional (e.g., Random Forest, Count Regression) models, this forecasting study improved understanding of which model types were the most accurate and demonstrated a more robust ensemble prediction. The ensemble models developed by the forecasting challenge provided more accurate forecasts in general, when compared to most individual models.

What is the impact on readiness and force health protection?

Respiratory diseases represent a major impediment to military readiness and force health, including interruptions in duties caused by isolation or quarantine requirements as well as morbidity caused by illnesses themselves. Respiratory disease forecasting is a useful tool for senior leaders’ preparations for illness surges.

Background

Seasonal respiratory infections, including influenza and COVID-19, represent a major impediment to military readiness. Accurate forecasts of the burden of respiratory illness in the Department of Defense population are crucial for allowing military leaders and public health practitioners to anticipate increases in disease activity and implement preventive measures. 

Since 2013, the U.S. Centers for Disease Control and Prevention has conducted an annual influenza forecasting challenge, inviting modelers to submit weekly forecasts of influenza-like illness or confirmed influenza hospitalizations.1 To produce more consistent and reliable forecasts across varying spatial resolutions, forecasting challenges often combine inputs from multiple models into one unified ensemble.2

Since 2019, the Integrated Biosurveillance Branch of the Armed Forces Health Surveillance Division,3 part of the Defense Health Agency’s Public Health Directorate, has conducted its own annual forecasting challenge during the influenza season, modeled after that of the CDC. The goal is to predict short-term (1 to 4 weeks ahead) respiratory disease activity among Military Health System beneficiaries within collections of geographically aligned military installations and medical facilities in the U.S. (“markets”) to support timely decision-making by senior leaders. In addition to forecasting disease activity among MHS beneficiaries, AFHSD also forecasts activity among civilians living in counties within 30 miles of a market. This challenge is open to forecasts submitted by government, academic, and industry partners.

During influenza season, AFHSD-IB reports forecast data through weekly biosurveillance products emailed to more than 3,000 individuals. Stakeholders can access these data as needed to inform resource allocation and prevention activities via an interactive dashboard (by Common Access Card only) updated weekly by AFHSD-IB.4 This dashboard includes summary information about respiratory illness in each market and DHA network, as well as maps and time series plots of 1- through 4-week ahead forecasts.

This report summarizes the results and lessons from AFHSD’s forecasts for the 2022-2023 forecasting season.

Methods

Influenza seasons were defined as epidemiological weeks 40 through 20, per CDC’s Morbidity and Mortality Weekly Report (MMWR) epidemiological week calendar.5 The 2022-2023 influenza season began on October 2, 2022 and ended May 20, 2023. The 2022-2023 challenge focused on MHS and civilian COVID-19 cases, as well as MHS COVID-like illness (CLI), influenza-like illness (ILI), and COVID-19 outpatient encounters.
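
MMWR epidemiological weeks run Sunday through Saturday; week 1 is the first week containing at least 4 days of the new calendar year, which is equivalent to the week containing January 4.5 The challenge's analyses were performed in R, but the convention can be sketched in a few lines of Python (the function names are illustrative, not from the challenge codebase):

```python
from datetime import date, timedelta

def sunday_of(d: date) -> date:
    # MMWR weeks run Sunday-Saturday; Python's weekday() is Mon=0 .. Sun=6
    return d - timedelta(days=(d.weekday() + 1) % 7)

def mmwr_week(d: date) -> tuple[int, int]:
    """Return (MMWR year, MMWR week) for a calendar date."""
    week_start = sunday_of(d)
    # A week belongs to the year holding at least 4 of its days,
    # i.e., the year of its Wednesday (week_start + 3 days)
    mmwr_year = (week_start + timedelta(days=3)).year
    # MMWR week 1 always contains January 4
    first_week_start = sunday_of(date(mmwr_year, 1, 4))
    return mmwr_year, (week_start - first_week_start).days // 7 + 1
```

With this definition, October 2, 2022 falls in week 40 of 2022 and May 20, 2023 falls in week 20 of 2023, matching the season boundaries above.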

Weekly respiratory illness data from multiple sources were downloaded for the 2022-2023 influenza season. MHS COVID-19 cases were collected by AFHSD’s Epidemiology & Analysis Branch using laboratory and reportable medical event data provided by the Defense Centers for Public Health–Portsmouth and DCPH–Aberdeen. The Armed Forces RME Guidelines and Case Definitions document defines 70 DOD RMEs, which closely mirror the nationally notifiable diseases monitored by CDC.6,7 A confirmed case of COVID-19 in MHS beneficiaries was defined using laboratory, clinical, epidemiological, and death certificate data (Unpublished, Supplementary Table 1). Civilian COVID-19 cases, by county, were obtained from HHS Protect and defined according to CDC criteria.8,9 MHS outpatient encounters were extracted from DOD’s Electronic Surveillance System for the Early Notification of Community-based Epidemics (ESSENCE). CLI, ILI, and COVID-19 encounter case definitions were developed internally using International Classification of Diseases, 10th Revision, Clinical Modification diagnosis codes, and are provided in Supplementary Table 1.

Weekly observed case and encounter data were used to generate 1- through 4-week ahead forecasts of disease activity. Forecasts were generated using various models, including time series (Autoregressive Integrated Moving Average [ARIMA]; Error, Trend, Seasonal [ETS]; Exponentially Weighted Moving Average [EWMA]; and Vector Autoregressive), machine learning (Random Forest), and count regression (Poisson, Negative Binomial, and Log-binomial) models. To create unified combinations of model inputs for evaluation across multiple spatial resolutions,10 the individual models were used to calculate three ensemble models: 1) the average of the time series and machine learning models (ENSEMBLE), 2) the average of the three best-performing time series and machine learning models (ENSEMBLE_TOP), and 3) the average of the count regression models (ENSEMBLE_CNT).
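
The report does not publish its combination code; under the assumption that each ensemble is an unweighted mean of its members' forecasts, the combinations can be sketched as follows (the model names, accuracy scores, and values below are invented for illustration):

```python
import numpy as np

# Hypothetical 1- through 4-week ahead point forecasts (cases per 100,000)
# from four illustrative member models; names and values are invented.
forecasts = {
    "arima":         np.array([12.0, 13.5, 15.0, 16.0]),
    "ets":           np.array([11.5, 12.5, 13.5, 14.5]),
    "random_forest": np.array([11.0, 12.0, 12.5, 13.0]),
    "poisson":       np.array([10.5, 11.0, 11.5, 12.0]),
}

# ENSEMBLE-style combination: unweighted mean across all member models
ensemble = np.mean(list(forecasts.values()), axis=0)

# ENSEMBLE_TOP-style combination: mean of the three members with the best
# (lowest) recent accuracy scores; the scores here are also invented.
recent_scores = {"arima": 3.2, "ets": 2.5, "random_forest": 2.1, "poisson": 2.8}
top3 = sorted(forecasts, key=recent_scores.get)[:3]
ensemble_top = np.mean([forecasts[m] for m in top3], axis=0)
```

An ENSEMBLE_CNT-style combination would apply the same unweighted mean to the count regression members only.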

The accuracy of forecasts compared to the observed activity for each model was evaluated by calculating a weighted interval score (WIS),10 a metric also used by the CDC to compare performance among models. A lower score indicates better model performance. All analyses were conducted using R software (version 4.1, The R Foundation for Statistical Computing, Vienna, Austria). The R packages “fable,” “randomForest,” and “tscount” were used to generate forecasts, and the “evalcast” package was used to calculate the WIS.11-14
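
The WIS combines the absolute error of the predictive median with proper interval scores for a set of central prediction intervals, weighted by interval level.10 The challenge itself used the R “evalcast” package; the following stand-alone Python sketch of the published formula is for illustration only:

```python
def interval_score(lower, upper, y, alpha):
    # Proper score for a central (1 - alpha) prediction interval [lower, upper]:
    # width plus penalties for observations falling outside the interval
    return (upper - lower) \
        + (2 / alpha) * max(lower - y, 0) \
        + (2 / alpha) * max(y - upper, 0)

def weighted_interval_score(median, intervals, y):
    """WIS per Bracher et al.; `intervals` maps alpha -> (lower, upper)."""
    K = len(intervals)
    total = 0.5 * abs(y - median)                 # median term, weight 1/2
    for alpha, (lo, hi) in intervals.items():
        total += (alpha / 2) * interval_score(lo, hi, y, alpha)
    return total / (K + 0.5)
```

With only a median and no intervals, the WIS reduces to the absolute error, which is why it is often interpreted as a probabilistic generalization of absolute error.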

Results

Weekly observed counts of MHS and civilian COVID-19 cases by market were converted to population-adjusted rates, while weekly observed MHS outpatient encounters were converted to a percentage of total outpatient encounters for that week. Weekly 1- through 4-week ahead forecasts for each ensemble model were generally higher than observed data, especially during periods of peak activity (December through February), with peaks in forecasted activity occurring later than observed peaks (Figure 1). The larger the forecasting horizon (i.e., 4 weeks ahead versus 1 week), the more pronounced the gap between forecasted peak and observed peak.

This figure is a compendium of five separate graphs, each of which charts observed and forecasted weekly data points connected by seven distinct lines along the x (horizontal) axis. The intervals along the x axis represent the months from October 2022 through June 2023. In each chart, the forecasted weekly case data from the three ensemble models tested are connected by six separate lines, one each for the one- and four-week ahead forecasts of each model, while the seventh line connects the weekly observed data. The five charts provide the observed and forecasted data for Military Health System COVID-19 cases, civilian COVID-19 cases, Military Health System COVID-19 health care encounters, Military Health System COVID-like illness health care encounters, and Military Health System influenza-like illness health care encounters. The first two charts, Military Health System and civilian COVID-19 cases, provide daily numbers of cases per 100,000 individuals, while the other three charts provide percentages of encounters for each condition. For Military Health System COVID-19 cases, which remained generally steady, the ENSEMBLE_CNT model’s one-week ahead forecast was the most accurate, while the ENSEMBLE_TOP four-week ahead forecast was the least accurate, predicting large spikes in January and February that did not occur. For civilian COVID-19 cases, the ENSEMBLE_CNT model’s one-week ahead forecast was again the most accurate, but the ENSEMBLE four-week ahead forecast was the least accurate. For Military Health System COVID-19 encounters, the ENSEMBLE_CNT model’s one-week ahead forecast was yet again the most accurate, and the ENSEMBLE four-week ahead forecast was the least accurate.
There was greater variability, or less accuracy, in Military Health System COVID-like illness encounter forecasting, but with the exception of January 2023, the ENSEMBLE_TOP model’s one-week ahead forecast was generally the most accurate, and the ENSEMBLE_TOP and ENSEMBLE_CNT models’ four-week ahead forecasts were the least accurate, with the latter consistently underestimating encounters. For Military Health System influenza-like illness encounters, the ENSEMBLE_CNT model’s one-week ahead forecast was generally the most accurate, although all model forecasts were fairly clustered, and under-predictive, from mid-February until the end of the study period.

Forecasts of peak MHS COVID-19 case rates were mostly higher than observed, ranging from 44% higher for the ENSEMBLE_CNT model to 457% higher for the ENSEMBLE_TOP model (Table 1). Peak civilian COVID-19 case rate forecasts were more accurate, ranging from 13% lower (ENSEMBLE_CNT) to 99% higher (ENSEMBLE). Peak encounter forecasts for the ENSEMBLE_CNT model were lower than observed peaks (16% and 9% lower for ILI and CLI, respectively) and equal to the observed peak for COVID-19 encounters. Peak encounter forecasts for the ENSEMBLE_TOP model were higher than observed peaks: 24% higher for ILI, 27% higher for CLI, and 10% higher for COVID-19 encounters. Peak week forecasts tended to be two to six weeks later than observed for most ensemble models and forecast targets. The ENSEMBLE_CNT model, however, accurately predicted peak civilian COVID-19 cases and MHS ILI encounters.

Overall, the ENSEMBLE_CNT model had the lowest WIS across all forecasting horizons, indicating the most accurate forecasts for civilian and MHS COVID-19 cases (Figure 2). The ENSEMBLE_TOP model was the most accurate for COVID-19 encounter forecasts, while all three ensemble models performed similarly for CLI and ILI encounters. Model performance decreased as forecast horizons increased, with the median WIS for all 4-week ahead ensemble forecasts increasing by 10% (MHS ILI encounters) to 98% (civilian COVID-19 cases) relative to 1-week ahead forecasts.

This figure is a basic chart of three columns and 25 rows that constitute 75 individually shaded cells. Each column represents one of the three ensemble forecasting models tested in the study. The 25 rows are grouped in five sections, for Military Health System COVID-19 cases, civilian COVID-19 cases, Military Health System influenza-like illness encounters, Military Health System COVID-like illness encounters, and Military Health System COVID-19 encounters. Each section comprises a row for each of four forecast horizons, from one to four weeks ahead, as well as a summary row of all documented cases or encounters. Each shaded cell illustrates an individual Weighted Interval Score for predictive errors for each ensemble model. The lower the score, the better the predictive model performed. The ENSEMBLE_CNT model had the lowest Weighted Interval Score across all forecasting horizons, indicating the most accurate forecasts for Military Health System and civilian COVID-19 cases. The ENSEMBLE_TOP model had the lowest Weighted Interval Score for COVID-19 encounter models. The Weighted Interval Score increased as the forecast horizon increased for the majority of models and metrics.

Discussion

These are the first published results from the AFHSD Respiratory Forecasting Challenge since its inception in 2019. Respiratory disease forecasting was more challenging during the 2022-2023 influenza season, due in part to decreased COVID-19 activity compared to prior years and ILI resurgence (Supplementary Table 2). Peak observed MHS and civilian COVID-19 case rates in 2022-2023 were 95% and 91% lower, respectively, compared to the 2021-2022 season, while peak observed MHS COVID-19 and CLI encounters were 76% and 26% lower, respectively, than the prior season. Conversely, peak observed MHS ILI encounters during the 2022-2023 season were 41% higher than during the 2021-2022 season and 111% higher than during the 2020-2021 season. Historical data for the previous two seasons were, therefore, not predictive of respiratory activity in 2022-2023.

Ensemble models, especially the ENSEMBLE_CNT and ENSEMBLE models, generally provided more accurate forecasts than most individual models (Supplementary Figure). Although certain individual models outperformed ensemble models for specific forecasting targets, including the Random Forest model for MHS COVID-19 case forecasts and the Poisson model for civilian COVID-19 case forecasts, each performed similarly to the best-performing ensemble model. Model performance decreased as the forecasting horizon increased, with WIS values ranging from 10% to 95% higher on average for 4-week ahead forecasts compared to 1-week ahead forecasts. These results are consistent with a previous publication of COVID-19 forecasts in the U.S. COVID-19 Forecast Hub that found that an ensemble model comprising 27 individual models was consistently more accurate than the individual models, and that the accuracy of forecasting models decreased as forecast horizons increased.15

This forecasting study has several strengths. First, the forecasting results showed that several models accurately predicted COVID-19 cases and respiratory encounters with enough lead time for senior leaders to take action. Second, this forecasting study tested a large number of traditional (e.g., ARIMA, EWMA) and non-traditional (e.g., Random Forest, Count Regression) models, increasing our understanding of which types of models were most accurate and providing a more robust ensemble prediction.

The forecasting of the 2022-2023 season also revealed several limitations that may have affected model accuracy. COVID-19 cases may have been generally under-reported in both DOD and civilian populations due to the large number of asymptomatic cases and the use of at-home testing. Data reporting schedules, particularly for civilian COVID-19 cases, changed dramatically during the season after the May 11, 2023 end of the U.S. Public Health Emergency for COVID-19. This policy change disrupted county case reporting by CDC.16 Many states and military treatment facilities also changed their COVID-19 case reporting schedules from daily to weekly, monthly, or not at all. To bridge some of the gaps in COVID-19 reporting, health encounter data from DOD ESSENCE could be utilized, but syndromic surveillance systems such as ESSENCE may suffer from inconsistent data quality between reporting sites and gaps in coverage.17 In addition, these data can lag by at least 4 days from the encounter date, leading to under-reporting of health encounters during the most recent week; such data present challenges for forecasting, as the observed value for the most recent week may change significantly in subsequent weeks. During the 2022-2023 season, reported numbers of civilian and MHS COVID-19 cases for a given week increased by as much as 50% one month after the initial reporting date, as older cases were reported, while MHS encounter data ranged from a 40% decrease to a 40% increase as additional encounters populated the system. Efforts were made to account for potential backfill in each market for both case and encounter data prior to generating weekly forecasts, but forecasting analysis can be challenging due to unpredictable data processing schedules. Other limitations included the availability and usefulness of covariate data. Data previously relied on for COVID-19 forecasting, including vaccination and case data, became less reliable or unavailable during the season.
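
A simple form of the backfill adjustment mentioned above scales the most recent weeks of a series by historical reporting-completeness factors estimated from past revisions. The factors and function below are invented for illustration and are not the branch's actual procedure:

```python
# Hypothetical reporting-completeness factors: the fraction of final counts
# typically present L weeks after an initial report (estimated from historical
# revision patterns; values here are invented).
completeness = {0: 0.70, 1: 0.90, 2: 0.97, 3: 1.00}

def adjust_for_backfill(observed, completeness):
    """Scale the most recent weeks of a count series up to their expected final size.

    observed: weekly counts ordered oldest -> newest; the last weeks are incomplete.
    completeness: maps lag in weeks (0 = most recent week) -> reported fraction.
    """
    adjusted = list(observed)
    for lag, frac in completeness.items():
        idx = len(adjusted) - 1 - lag        # index of the week at this lag
        if idx >= 0 and frac > 0:
            adjusted[idx] = adjusted[idx] / frac
    return adjusted
```

In practice the completeness factors would need to be re-estimated per market and data source, and the unpredictable processing schedules noted above are precisely what makes such factors unstable.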

Another limitation of this study is the relative usefulness and timeliness of the forecasts. As mentioned, forecast accuracy decreased as the forecasting horizon increased. The data lags in ESSENCE, compounded by the time required to download and aggregate weekly data and generate weekly forecasts, meant that weekly forecasts were not available to senior leaders until nearly one week after the most recently observed data. This circumstance renders the 1-week ahead forecasts of disease activity mostly unusable, limiting senior leaders’ response time to 2-week ahead forecasts. Although the 3- and 4-week ahead forecasts provide adequate time for senior leaders to make necessary preparations, their accuracy is greatly diminished compared to 1- and 2-week ahead forecasts. The utility of 1- and 2-week ahead forecasts may be improved by downloading data earlier each week and generating weekly forecasts more efficiently, but improving the more distant horizon forecasts and expanding beyond 4 weeks are current priorities.

Future AFHSD-IB respiratory forecasting challenges will consider additional covariates, such as environmental data, and combine time series and count regression forecasts into a single ensemble model. The incorporation of new models, such as neural network models, machine learning models, and wavelet forecasting, will also be explored. More emphasis will be placed on non-pandemic seasons to lessen the impacts of changes in COVID-19 and influenza reporting. Forecasting will focus on more consistently available data sources for both DOD and civilian populations, including COVID-19 hospitalizations, influenza hospitalizations, and health encounter data. As the time elapsed since the initial years of the COVID-19 pandemic increases, historical data may become more reliable in predicting the volume and peak activity for COVID-19 and other respiratory diseases during upcoming influenza seasons.

This figure is a basic chart of 11 columns and 25 rows that constitute 275 individually shaded cells. Each column represents a forecasting or analysis model utilized in this study, including three ensemble models and eight individual forecasting models. The 25 rows are grouped in five sections, for Military Health System COVID-19 cases, civilian COVID-19 cases, Military Health System influenza-like illness encounters, Military Health System COVID-like illness encounters, and Military Health System COVID-19 encounters. Each section comprises a row for each of four forecast horizons, from one to four weeks ahead, as well as a summary row of all documented cases or encounters. Each shaded cell illustrates an individual Weighted Interval Score for predictive errors for each model. The lower the score, the better the predictive model performed. The ensemble models generally had lower Weighted Interval Scores than most of the individual models. However, the Random Forest model for Military Health System COVID-19 cases and the Poisson model for civilian COVID-19 cases each had a Weighted Interval Score similar to that of the ENSEMBLE_CNT model.

Authors’ Affiliation

Armed Forces Health Surveillance Division, Integrated Biosurveillance Branch, Silver Spring, MD: Mr. Bova, Dr. McGee, Ms. Elliott, Mr. Ubiera

Acknowledgments

Christian T. Bautista, PhD; Bethany A. Vance, MS; Abagail D. Chepenik, MS; Michael J. Celone, MPH

References

  1. U.S. Centers for Disease Control and Prevention. About Flu Forecasting. Accessed Feb. 29, 2024. https://www.cdc.gov/flu/weekly/flusight/howflu-forecasting.htm 
  2. Mathis SM, Webber AE, León TM, et al. Evaluation of FluSight influenza forecasting in the 2021-22 and 2022-23 seasons with a new target laboratory-confirmed influenza hospitalizations. Preprint. medRxiv. 2023;2023.12.08.23299726. Published Dec. 11, 2023. doi:10.1101/2023.12.08.23299726 
  3. Armed Forces Health Surveillance Division. Health Readiness. Accessed Nov. 30, 2023. https://health.mil/Military-Health-Topics/Health-Readiness/AFHSD  
  4. Military Health System, U.S. Department of Defense. AFHSD-IB Respiratory Markets to Watch Dashboard. Accessed Apr. 3, 2024. https://bitab.health.mil/#/views/Respiratory_M2W_Dashboard/Resp_Activity?:iid=1 
  5. National Notifiable Diseases Surveillance System, U.S. Centers for Disease Control and Prevention. MMWR Weeks. Accessed Feb. 29, 2024. https://ndc.services.cdc.gov/wp-content/uploads/MMWR_Week_overview.pdf   
  6. Armed Forces Health Surveillance Division, Defense Health Agency, U.S. Department of Defense. Armed Forces Reportable Medical Events Guidelines and Case Definitions. Armed Forces Health Surveillance Branch. 2022. Accessed Mar. 7, 2024. https://www.health.mil/Reference-Center/Publications/2022/11/01/Armed-Forces-Reportable-Medical-Events-Guidelines 
  7. National Notifiable Diseases Surveillance System, U.S. Centers for Disease Control and Prevention. Surveillance Case Definitions for Current and Historical Conditions. Accessed Mar. 7, 2024. https://ndc.services.cdc.gov 
  8. U.S. Department of Health and Human Services. HHS Protect. Accessed Nov. 30, 2023. https://protect.hhs.gov 
  9. U.S. Centers for Disease Control and Prevention. Coronavirus Disease 2019 (COVID-19) 2021 Case Definition. Accessed Mar. 7, 2024. https://ndc.services.cdc.gov/case-definitions/coronavirusdisease-2019-2021
  10. Bracher J, Ray EL, Gneiting T, Reich NG. Evaluating epidemic forecasts in an interval format. PLOS Comput Bio. 2021;17(2):e1008618. doi:10.1371/journal.pcbi.1008618   
  11. O'Hara-Wild M, Hyndman R, Wang E. fable: Forecasting Models for Tidy Time Series. R package version 0.3.3. 2023. https://CRAN.R-project.org/package=fable 
  12. Liaw A, Wiener M. Classification and regression by randomForest. R News. 2002;2(3):18-22. https://journal.r-project.org/articles/RN-2002-022/RN-2002-022.pdf   
  13. Liboschik T, Fokianos K, Fried R. tscount: An R package for analysis of count time series following generalized linear models. J Stat Softw. 2017;82(5):1-51. doi:10.18637/jss.v082.i05   
  14. McDonald D, Bien J, O’Brien M, et al. evalcast: tools for evaluating COVID forecasters. 2023. https://cmu-delphi.github.io/covidcast/evalcastR [and] https://github.com/cmu-delphi/covidcast   
  15. Cramer EY, Ray EL, Lopez VK, et al. Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States (published correction in Proc Natl Acad Sci USA. 2023;120(15):e2304076120). Proc Natl Acad Sci USA. 2022;119(15):e2113561119. doi:10.1073/pnas.2113561119 
  16. U.S. Centers for Disease Control and Prevention. End of the Federal COVID-19 Public Health Emergency (PHE) Declaration. Updated Sep. 12, 2023. Accessed Mar. 8, 2024. https://archive.cdc.gov/#/details?q=COVID-19&start=0&rows=10&url=https://www.cdc.gov/coronavirus/2019-ncov/your-health/end-of-phe.html 
  17. Thomas MJ, Yoon PW, Collins JM, Davidson AJ, MacKenzie WR. Evaluation of syndromic surveillance systems in 6 US state and local health departments. J Public Health Manag Pract. 2018;24(3):235-240. doi:10.1097/PHH.0000000000000679
Last Updated: June 06, 2024