key: cord-1001081-10mbsqmo authors: Xu, Stanley; Clarke, Christina; Shetterly, Susan; Narwaney, Komal title: Estimating the Growth Rate and Doubling Time for Short-Term Prediction and Monitoring Trend During the COVID-19 Pandemic with a SAS Macro date: 2020-04-11 journal: nan DOI: 10.1101/2020.04.08.20057943 sha: 3dead804aa9a4d23f88bb79504d04fdc11f2fa10 doc_id: 1001081 cord_uid: 10mbsqmo

Coronavirus disease (COVID-19) has spread around the world and is placing tremendous stress on the US health care system. Knowing the trend of the COVID-19 pandemic is critical for federal and local governments and health care systems to prepare plans. Our aim was to develop an approach and create a SAS macro to estimate the growth rate and doubling time in days. We fit a series of growth curves using a rolling approach to estimate the growth rates and the doubling times. This approach was applied to the death data of New York State from March 14th to March 31st. The growth rate was 0.48 (95% CI, 0.39-0.57) and the doubling time was 2.77 days (95% CI, 2.49-3.04) for the period of March 14th-March 20th; the growth rate decreased to 0.25 (95% CI, 0.22-0.28) and the doubling time increased to 4.09 days (95% CI, 3.73-4.44) for the period of March 25th-March 31st. This approach can be used for short-term prediction and for monitoring the trend of the COVID-19 pandemic.

In December 2019, an outbreak of coronavirus disease (COVID-19) caused by the novel coronavirus (SARS-CoV-2) began in Wuhan, China, and has now spread across the world [1, 2]. In the United States, the cumulative number of identified COVID-19 cases was 186,101 as of March 31st, 2020; among the identified cases, 3,603 died [3]. To slow the spread of COVID-19, federal and local governments issued mitigation measures such as case isolation, quarantine, school closures and the closing of non-essential businesses.
The COVID-19 pandemic imposes tremendous challenges on the US health care system, particularly given concerns that the need for hospital beds and ICU beds could exceed capacity [4-6]. Predicting the future numbers of COVID-19 cases and healthcare utilization is critical for governments' and health care systems' preparation plans [4, 6, 7]. Two useful and critical quantities for prediction are the growth rate [8] and the doubling time of the number of events [9]. The growth rate is the percent change in the number of daily events from one day to the next (e.g., COVID-19 cases, number of patients hospitalized or number of deaths). The doubling time is the length of time required to double the number of daily events. Our goal was to develop an approach and create a SAS macro using observed data to estimate the growth rate and doubling time in days for short-term prediction.

In the United States, there were several barriers to testing people for COVID-19, such as shortages of swabs and testing kits and restrictions on who should get tested. Therefore, the number of COVID-19 cases was often under-identified and underreported. However, the number of hospitalized COVID-19 patients and the number of deaths due to COVID-19 were more reliable than the reported number of COVID-19 cases [10]. In this paper, we used the number of daily deaths to calculate the growth rate and doubling time in days.

We assumed a growth curve of daily deaths over a period of $n$ days from day $t$ (start day) to day $(t + n - 1)$:

$$y_{t+j} = y_t (1 + r)^{j}, \quad j = 0, 1, \ldots, n - 1 \qquad (1)$$

where $y_t$ is the number of daily deaths on the start day and $r$ is the growth rate. We fit two models: a) using equation (1) to estimate the growth rate $r$; [...]

For short-term prediction, the number of daily deaths on a future day $m$ is $y_m = y_k (1 + r_k)^{m - k}$, where $y_k$ is the number of daily deaths on the last day $k$ of the last period and $r_k$ is the estimated growth rate from the last period. As the growth rate changes over time, the prediction is only appropriate for the short term (e.g., within 7 days), and updated growth rates should be used.

[...], the predicted numbers of daily deaths for April 1st and 2nd were 468 and 586, respectively. The observed number of deaths in New York State was 498 on April 1st. SAS programs are available for conducting these analyses (Appendix A and Appendix B).
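The short-term prediction step can be sketched in a small DATA step that mirrors the formula used in the appendix macro, y_m = y_k(1 + r_k)^(m - k). The starting count of 100 and the growth rate of 0.25 are hypothetical values chosen for illustration, not estimates from the New York State data, and the dataset name `prediction_sketch` is likewise an assumption.

```sas
/* Minimal sketch of the short-term prediction formula.           */
/* y_k and r_k are hypothetical inputs, not values from the paper. */
data prediction_sketch;
    y_k = 100;    /* assumed number of deaths on the last observed day k  */
    r_k = 0.25;   /* assumed growth rate estimated from the last interval */
    do i = 1 to 3;                          /* i = m - k, days ahead */
        y_m = round(y_k * (1 + r_k)**i);    /* predicted deaths on day k + i */
        output;
    end;
    keep i y_m;
run;

proc print data=prediction_sketch noobs;
run;
```

With these inputs the three predicted counts are 125, 156 and 195. Because the growth rate changes over time, such a projection would only be reasonable a few days ahead and would need to be refreshed as new data arrive.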
These models can be similarly applied to hospitalization data if those data are available. When COVID-19 testing is widely available to the public and testing is less selective, these models can also be used to directly estimate the growth rate and the doubling time for COVID-19 cases. Because of the lag in reporting deaths, we recommend excluding the most recent 1-2 days of death data when fitting the growth curves. This paper illustrates that death data can be used to estimate the growth rate and doubling time to aid in predicting future deaths, hospitalizations and COVID-19 cases. Because a series of growth curves was fit, the RGCA approach can also be used for real-time monitoring of the epidemic trend, as shown in Figure 1.

The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. medRxiv preprint doi: https://doi.org/10.1101/2020.04.08.20057943

Figure 1. Estimated growth rate with 95% CIs over time using death data from New York State.

APPENDIX A

/*************************************************************
* Title:
[...]

title 'Growth rate and doubling time for each interval';
proc print data = r_doubling_time;
    var start_day end_day
[...]
    endloop = &fup_end-&int_length+1;
    format start endloop date9.;
    put "Looping through the starting date and the last date -&int_length +1 days:";
    put start= endloop=;
run;

%do start_day = &fup_start %to (&fup_end-&int_length+1);

    /*Hold onto the current start day to append to some of the final datasets so they do not get overwritten*/
    %let stdyfmt = %sysfunc(putn(&start_day,date9.));

    /*Using the input dataset, calculate the last day that will be considered in these calculations for each interval. The date of death needs to be between the start and ending day*/
    data CGR_dat01;
        set &indat;
        end_day = &start_day + &int_length - 1;
        if &start_day <= &dateofevent <= end_day;
        format end_day date9.;
    proc sort;
        by &dateofevent;
    run;

    /*This step will retain the number of deaths from the first date of the current interval through each date deaths were reported.*/
    data CGR_dat02;
        set CGR_dat01;
        by &dateofevent;
        retain start_new_event;
        if _N_=1 then start_new_event = &numevents;
    run;

    /*This model will estimate r, the growth rate. Create one dataset for each iteration of the start day.*/
    proc nlin data=CGR_dat02 list noitprint;
        parms r = 0.75;
        model &numevents = start_new_event*((1+r)**(&dateofevent - &start_day));
        output out = preddeath_start_&stdyfmt predicted = Pred lclm = Lower95 uclm = Upper95;
        ods output ParameterEstimates = r_Estimates;
    run;
    quit;

    data r_macro_var;
        set r_Estimates;
        call symput('r_macro_var',estimate);
    run;
/*Print the resulting data*/
%do start_day = &fup_start %to (&fup_end-&int_length+1);
    %let stdyfmt = %sysfunc(putn(&start_day,date9.));
    TITLE "Observed and predicted events and 95% CI for the interval beginning on &stdyfmt";
    proc print data=preddeath_start_&stdyfmt noobs;
        var &dateofevent &numevents pred lower95 upper95;
        format &dateofevent mmddyy10.;
    run;
%end;

*Now look at the short-term future predictions based on the last date of deaths from the incoming dataset. The corresponding number of deaths will also be used.;

/*First, get the number of deaths/events on the last day (k) of the last period for estimating the growth rate*/
proc sql noprint;
    select distinct &numevents
    into :y_k
    from &indat
    where &dateofevent = &fup_end ;
quit;

/*Next, get r_k from the doubling time, which is the estimated growth rate from the last period*/
proc sql noprint;
    select distinct r ,r_lowerCL ,r_upperCL
    into :r_k, :r_k_lower, :r_k_upper
    from r_doubling_time
    where end_day = &fup_end ;
quit;

%put &fup_end &y_k &r_k &r_k_lower &r_k_upper;

data prediction;
    k=&fup_end;
    do i=1 to &int_length;
        m = k + i;
        y_m = round(&y_k*((1+&r_k))**(m-k));
        y_m_lowerCL = round(&y_k*((1+&r_k_lower))**(m-k));
        y_m_upperCL = round(&y_k*((1+&r_k_upper))**(m-k));
        output;
    end;
    format m date9.;
    keep m y_m y_m_lowerCL y_m_upperCL;
run;

title "Predicted number of deaths for the next &int_length days";
proc print data=prediction noobs;
run;

%mend Calc_GrowthRates;
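A call to the macro might look like the following sketch. It assumes the complete macro from this appendix has already been compiled in the SAS session. The dataset name `work.daily_deaths`, the variable names `event_date` and `deaths`, and the daily counts are all made up for illustration and are not the New York State data.

```sas
/* Hypothetical input: one row per day, with the date stored as a */
/* numeric SAS date value and the number of deaths on that day.   */
/* The counts below are invented purely to show the layout.       */
data work.daily_deaths;
    input event_date :date9. deaths;
    datalines;
14MAR2020 2
15MAR2020 3
16MAR2020 4
17MAR2020 6
18MAR2020 9
19MAR2020 13
20MAR2020 19
21MAR2020 28
22MAR2020 40
23MAR2020 58
;
run;

/* Positional arguments: input dataset, date variable, count variable, */
/* and interval length in days (7, as in the paper's example).         */
%Calc_GrowthRates(work.daily_deaths, event_date, deaths, 7);
```

With ten days of data and a 7-day interval, the rolling loop would cover four start days (March 14th through March 17th), and the final prediction step would project the seven days after March 23rd.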
Novel Coronavirus
Director-General's opening remarks at the media briefing on COVID-19 -11
CDC
IHME COVID-19 health service utilization forecasting team. Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator days and deaths by US state in the next 4 months
Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand
American hospital capacity and projected need for COVID-19 patient care. Health Aff (Millwood)
Forecasting the novel coronavirus COVID-19
Ancel Meyers L. Serial interval of COVID-19 among publicly reported confirmed cases. Emerg Infect Dis
Visualising the doubling time of COVID-19 allows comparison of the success of containment measures
SAS Institute, version 9

This research was supported by the Institute for Health Research, Kaiser Permanente Colorado. Xu was also supported by NIH/NCRR Colorado CTSI Grant Number UL1 RR025780.

/*************************************************************
* Title:
* Programmer: Stanley Xu and Christina Clarke
*             Institute for Health Research
*             Kaiser Permanente Colorado
*
* Date Created: 4/3/2020
* Description: This macro is designed to calculate a predicted
*   growth and doubling time of a disease given observed
*   data. In particular, these models were based on observed
*   deaths since the true denominator is often unknown given
*   testing may not be done on all symptomatic or asymptomatic
*   individuals. Further, hospitalizations could be used if they
*   are known.
*
* Input: indat = input dataset with the number of deaths and
*          date of those deaths during a date range
*          that is to be modeled.
*        dateofevent = variable name of the date the deaths
*          occurred from the indat dataset
*        numevents = variable name that has the number of deaths
*          that occurred on each date of death
*          from the indat dataset
*        int_length = number of days in each interval; our
*          example examined 7-day intervals to create
*          piece-wise growth intervals
*
* Output:
*
* References:
*
* Changes to the code log :
* Date      Programmer  Description
*-------------------------------------------------------------
* 4/3/2020  cclarke     CH001 remove the state variable option
**************************************************************/
%macro Calc_GrowthRates(indat, dateofevent, numevents, int_length);

*First, we need to get the start and end dates from the input dataset.;
proc sql noprint;
    select distinct min(&dateofevent) ,max(&dateofevent)
    into :fup_start ,:fup_end
    from &indat ;
quit;

/*For QA - Prints the first and last date found in the input data file which will appear in the log*/
data _null_;
    start = &fup_start;