key: cord-0812927-526elsrf
authors: Xu, R.; Rahmandad, H.; Gupta, M.; DiGennaro, C.; Ghaffarzadegan, N.; Jalali, M. S.
title: Weather Conditions and COVID-19 Transmission: Estimates and Projections
date: 2020-05-08
journal: nan
DOI: 10.1101/2020.05.05.20092627
sha: bc5cd0b017d598835d0da28eb9416fed092287f7
doc_id: 812927
cord_uid: 526elsrf

Background: Understanding and projecting the spread of COVID-19 requires reliable estimates of how weather components are associated with the transmission of the virus. Prior research on this topic has been inconclusive. Identifying key challenges to reliable estimation of weather impact on transmission, we study this question using one of the largest assembled databases of COVID-19 infections and weather. Methods: We assemble a dataset that includes virus transmission and weather data across 3,739 locations from December 12, 2019 to April 22, 2020. Using simulation, we identify key challenges to reliable estimation of weather impacts on transmission, design a statistical method to overcome these challenges, and validate it in a blinded simulation study. Using this method and controlling for location-specific response trends, we estimate how different weather variables are associated with the reproduction number for COVID-19. We then use the estimates to project the relative weather-related risk of COVID-19 transmission across the world and in large cities. Results: We show that the delay between exposure and detection of infection complicates the estimation of weather impact on COVID-19 transmission, potentially explaining significant variability in results to date. Correcting for that distributed delay and offering conservative estimates, we find a negative relationship between temperatures above 25 degrees Celsius and estimated reproduction number ([R]), with each degree Celsius associated with a 3.1% (95% CI, 1.5% to 4.8%) reduction in [R]. Higher levels of relative humidity strengthen the negative effect of temperature above 25 degrees. Moreover, one millibar of additional pressure increases [R] by approximately 0.8 percent (95% CI, 0.6% to 1%) at the median pressure (1016 millibars) in our sample. We also find significant positive effects for wind speed, precipitation, and diurnal temperature on [R].
Sensitivity analysis and simulations show that results are robust to multiple assumptions. Despite conservative estimates, weather effects are associated with a 43% change in [R] between the 5th and 95th percentile of weather conditions in our sample. Conclusions: These results provide evidence for the relationship between several weather variables and the spread of COVID-19. However, the (conservatively) estimated relationships are not strong enough to seasonally control the epidemic in most locations.

The COVID-19 pandemic has significantly challenged the global community. High-stakes policy decisions require projections of the course of the pandemic across different geographic regions. Thus, it is critical to know how weather conditions impact the transmission of the disease [1] . Given that many related viral infections such as seasonal flu [2, 3] , MERS [4] [5] [6] , and SARS [7] show notable seasonality, one may expect the transmission of SARS-CoV-2 virus to be similarly dependent on weather conditions. Earlier works indicate that temperature, humidity, air pressure, ultraviolet light exposure, and precipitation potentially impact the spread of COVID-19 through changing the survival times of the virus on surfaces and in droplets, moderating the distances the virus may travel through air, and impacting individual activity patterns and immune responses [8] [9] [10] [11] [12] .

Yet, there is limited agreement on the shape and magnitude of those relationships. While many studies find a correlation between variations in temperature [13, 14] , relative and absolute humidity [15] [16] [17] [18] [19] [20] [21] [22] [23] , ultraviolet light [24] , and wind speed, visibility, and precipitation [25] [26] [27] with measures of pandemic severity [28] , other works [24, 29-31] indicate weaker, inconsistent, or no relationships. A recent review finds inconclusive evidence for the relevance of weather in the transmission of COVID-19 [1] .

It is unclear what explains the inconclusive results. Earlier samples based on datasets focused on China may be too narrow [16, 19, 24, 32] . Others have studied only a subset of weather components that complicate comparisons [18, 19, 33] . Most have not controlled for other important correlates such as government and public responses, population density, and cultural practices [15-17, 24-26, 32, 34] . Another especially under-studied factor is the delays between infection and official recording of cases. These delays, estimated to be close to 10 days [35, 36] , confound attempts to associate daily weather conditions with recorded new cases. Extending time windows over which transmission trends are calculated may partially address this challenge, but significantly reduces the number of data points, complicates interpretation and projections, and increases the risk that results are driven by spurious correlation between weather and location specific factors and trends. Therefore, lack of corrections for these delays may partially explain the inconsistent and inconclusive findings to date. This paper assembles one of the most comprehensive datasets of the global spread of COVID-19 pandemic until late April 2020, builds and validates a statistical method for the estimation of reproduction number controlling for detection delay, location-specific density, and time variant responses, estimates the association of weather conditions and the reproductive number of COVID-19, and provides year-round, global projections.

To track infections, we use the official case reports from various countries. Our starting point is the data collected and compiled by the Johns Hopkins Center for Systems Science [37] . We augment these data with those coming from the Chinese Center for Disease Control and Prevention, Provincial Health Commissions in China, and Iran's state-level reports. The data from the beginning of the epidemic (December 12, 2019) to April 22, 2020 are used for our analysis. We assemble disaggregate data on the spread of COVID-19 in Australia (8 states), Canada (10 states), China (34 province-level administrative units and also 301 individual cities), Iran (31 states), and the United States (3144 counties and 5 territories) and use country level aggregates for the rest of the world. Overall, our dataset includes cumulative infection data for 3739 distinct locations.

We use the Historical Weather database (World Weather Online and OpenWeather Ltd., 2020) to compile weather data. For each location for which we have infection data, we use the weather data for the longitude and latitude of the centroid of that location. We collected minimum and maximum daily temperature, humidity, precipitation, snowfall, moon illumination, sun hours, ultraviolet index, cloud cover, wind speed and direction, and pressure data; however, only a subset of these variables proved relevant in our analysis. We used population density data from Demographia (Cox, W., Demographia, The Public Purpose), the United States Census (U.S. Census Bureau, data.census.gov/cedsci), the Iran Statistical Centre, the United Nations' projections, City Population (citypopulation.de), and official published estimates for countries not covered by these sources. We also projected contagion risk forward in our simulations for a list of highly populated cities.

A critical parameter in understanding the spread of an epidemic is the basic reproduction number, R0, which reflects the number of secondary cases generated by an index patient in a fully susceptible population. An epidemic is expected when R0 is above 1 and will die out with R0 values below 1.

Measuring R0 directly requires data that are often not available at scale [38]. However, early in the epidemic, before the stock of susceptible populations is depleted, R0 is close to the reproduction number, R, which measures the number of secondary cases from each infected case. The reproduction number can be approximated (R̂) based on the number of new infections (IN) per currently infected individual, multiplied by the duration of illness (τ) (Equation 1). Actual new infections on any day (IN) are not directly observable and should be estimated. Data on measured daily infections (IM) lag actual new infections by both the incubation period and the delay between onset of symptoms and testing and recording of a case. We use published measures to quantify the distribution of the incubation period (averaging 5 to 6 days [25, 35, 36, 39]) and the onset-to-detection delay (ranging from 4 to 6 days [35, 36]) to quantify the overall detection delay. Given the variance in detection delay, a simple shift of measured infections by the mean delay (about 10 days) offers an unreliable estimate of true infections (see Sections S3 and S5.2.2.2 in Supplementary Document). We therefore use an optimization to find the daily estimated new infections (ÎN) that are consistent with the observed measured infections (IM) and the overall detection delay distribution (see Section S3 for details).

We use R̂(t) as our dependent variable. This formula is robust to the existence of asymptomatic cases and under-counting (which is likely due to imperfect test coverage) as long as changes in test coverage are not correlated with weather conditions 10 days earlier (see Sections S3 and S5.2.2.3 in Supplementary Document). We use a delay of τ=20 days from exposure to resolution, and results are robust to other durations of illness (see Section S4.1 in Supplementary Document). For each location we only include days with ÎN values above one. ÎN values precede actual detection, and thus the reliability of early values for each location is affected by irregularities in early testing. Moreover, an unbiased estimate for R̂(t) requires τ days of prior new infection estimates. Thus, to ensure robustness we separately exclude the first 20 days after ÎN reaches one in each location. Robustness to these exclusion criteria and to the exclusion of outliers based on R̂(t) values is discussed in Sections S4.2 and S4.3 in Supplementary Document.
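A minimal sketch of this calculation follows. It assumes that "currently infected" is approximated by the sum of estimated new infections over the preceding τ days; the exact form of Equation 1 is not reproduced here, and the function name and argument names are illustrative only.

```python
def estimate_r_hat(new_infections, tau=20, burn_in=20):
    """Return {day_index: R_hat} for the days kept under the exclusion rules.

    new_infections: list of estimated daily new infections for one location.
    Assumption: currently infected ~ sum of new infections over the prior tau days.
    """
    # First day on which estimated new infections exceed one.
    first = next((i for i, x in enumerate(new_infections) if x > 1), None)
    if first is None:
        return {}

    r_hat = {}
    for t in range(len(new_infections)):
        # Drop the first `burn_in` days after the threshold is reached, and
        # any day without a full tau-day history of prior estimates.
        if t < first + burn_in or t < tau:
            continue
        # Keep only days with estimated new infections above one.
        if new_infections[t] <= 1:
            continue
        currently_infected = sum(new_infections[t - tau:t])
        if currently_infected > 0:
            r_hat[t] = tau * new_infections[t] / currently_infected
    return r_hat
```

Under this form, a location with constant daily infections yields an estimate of exactly one, which matches the intuition that a flat epidemic curve corresponds to each case replacing itself.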

Prior research [15, 23, 40] suggests various weather-related factors may have a role in the survival of the virus on surfaces, spread of droplets containing the virus, as well as behaviors and responses of human hosts relevant for understanding reproduction number. We therefore include the following daily variables as predictors: temperature (mean (T̄) and diurnal temperature (difference between maximum and minimum daily temperature; ΔT), both in Celsius), relative humidity (H) (percentage), pressure (P) (millibars), precipitation (C) (millimeters), snowfall (S) (centimeters), wind speed (W) (km/hour), and number of hours of sun received per day (U). We also explore a few interactions among these variables.

Different locations vary in their reproduction number due to various location-specific factors. While location fixed effects can control for many of those variations, our simulation studies show that fixed effects, combined with the measurement error inherent in inferring R̂ from reported infections, may significantly attenuate and bias weather effects. Empirically we do not find such sensitivity (see Section S4.2 in Supplementary Document), and results are consistent with or without fixed effects. Nevertheless, we report fixed effect results in the supplement and in our preferred specification control for log-transformed population density. Moreover, public gathering bans, school closures, physical distancing, and other responses to the COVID-19 outbreak change endogenously in each location and could confound our findings. To account for these changes, we include a location-specific linear trend in predicting R̂(t).

Given the large variations in R̂ values estimated with this method, we use a log transformation of R̂(t) and linear models to predict ln(R̂). Thus, the exponential of each estimated effect is the multiplier changing R̂ around its base value indicated by the location-specific density, policies, and responses. We designed and validated our statistical model for estimating ln(R̂) by testing its ability to identify true parameters in synthetic data. Specifically, we built a stochastic simulation model of the COVID-19 epidemic, generated synthetic infection data using historical weather inputs and known impact functions, and fine-tuned the statistical model until it could reliably identify true effects in a large set of simulated epidemics. We found that: a) Whereas given actual infections (IN), our method, with or without fixed effects, would identify the true functional form relating temperature to transmission rate, estimates become less reliable and significantly attenuated when true infections are inferred rather than exact; b) Our method of estimating true infections (see Sections S3 and S5.2.2.2 in Supplementary Document) offers significantly better results than simple shifting of official infection counts; and c) The most reliable results in synthetic epidemics are found when location-specific trends (but not fixed effects) are used in model specification.

Separately, authors NG and MG created a more realistic individual-based model of disease transmission and used that to generate a separate test dataset with synthetic epidemics. Three scenarios used actual temperatures from a sample of 100 regions and three different functions for temperature effect (including the placebo scenario of no effect). Then author RX, who was blinded to the true specification of temperature effect in the latter synthetic dataset, was able to approximate the true function with correct qualitative shape using the refined statistical method. Details on this test are reported in the Section S5.3 in Supplementary Document.

In short, the preferred specification includes location-specific trends, excludes days with ÎN < 1 and the first 20 days after ÎN exceeds one for the first time, and includes the following main and interaction effects: linear spline effect of average temperature with the knot at 25 degrees (see Section S4.5 for alternative knot values), diurnal temperature, relative humidity, quadratic effect of pressure (transformed: X - 1000), precipitation (transformed: ln(X+1)), snowfall (transformed: ln(X+1)), wind speed (transformed: ln(X+1)), number of hours of sun received per day, and interaction between relative humidity and average temperature.
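The transformations listed above can be sketched as a single feature-building function. This is an illustration under stated assumptions rather than the estimation code: the feature names are hypothetical, the spline is parameterized as separate below-knot and above-knot segments, and the humidity interaction is formed with each temperature segment, which is one plausible reading of the specification.

```python
import math

KNOT_C = 25.0  # spline knot for mean temperature, from the text


def weather_features(t_mean, t_diurnal, humidity, pressure,
                     precip, snow, wind, sun_hours):
    """Build the transformed weather regressors for one location-day."""
    # Linear spline: the two segments always sum to t_mean.
    t_below = min(t_mean, KNOT_C)
    t_above = max(t_mean - KNOT_C, 0.0)
    # Pressure is centered at 1000 mb before taking the quadratic.
    p = pressure - 1000.0
    return {
        "temp_below_knot": t_below,
        "temp_above_knot": t_above,
        "diurnal_temp": t_diurnal,
        "humidity": humidity,
        "pressure": p,
        "pressure_sq": p * p,
        "ln_precip": math.log(precip + 1.0),
        "ln_snow": math.log(snow + 1.0),
        "ln_wind": math.log(wind + 1.0),
        "sun_hours": sun_hours,
        # Assumed form of the humidity-temperature interaction.
        "humidity_x_temp_below": humidity * t_below,
        "humidity_x_temp_above": humidity * t_above,
    }
```

With this parameterization, the coefficient on each temperature segment is directly the slope of ln(R̂) in that regime, which is how the per-degree effects below and above 25 degrees are reported.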

While more complex specifications including nonlinear models could be used, we opted for a simpler and more theoretically-driven alternative for two reasons. First, location-specific trends account for much of the predictive power of the model, and thus, fine-tuning weather terms using cross-validation does not offer significant improvements in predictive power. Second, effects from more complex models would be harder to interpret and communicate, and less reliable to extrapolate.

After calculating a weather response function using existing data, we project the impact of weather conditions on the relative future risk of pandemic for all locations in our sample as well as the major urban areas of the world (those with more than 0.5 million population as of 2017; a total of 1072 cities) that constitute about 30% of the world population. A summary of results is provided in the main paper, and the online appendix and an interactive online platform offer details.

Table 1 reports our main regression results. This statistical model closely tracks ln(R̂) values (R² = .534). Much of the accuracy is due to location-specific trends. As expected, (log) population density is strongly associated with ln(R̂) (β=.179, 95% CI, .159 to .199). Most locations show rapid reductions in reproduction number captured in the location-specific trends. Specifically, on average R̂ goes down by 4.2% (SD=2.8%) per day from the day the estimated number of new infections first exceeds 1 in each location. Mean temperature (T̄), diurnal temperature (ΔT), air pressure (P), wind speed (W), snowfall (S) and precipitation (C) are significant predictors of transmission.

The effect of mean daily temperature (T̄) is best characterized within two regimes, below and above 25 degrees Celsius, and in interaction with humidity. Temperatures higher than 25 degrees are associated with lower transmission rates (by 3.1% (95% CI, 1.5% to 4.8%) per degree Celsius; excluding interaction with H), while those below that threshold have a smaller impact (0.5% (95% CI, 0.29% to 0.77%) reduction in transmission rates per degree Celsius). The negative effect of temperatures above 25 degrees is strengthened at higher levels of relative humidity: a 10% increase in relative humidity is associated with an additional 1.2% (95% CI, 0.6% to 1.7%) decrease in transmission rates for each one degree increase in temperature above 25 degrees. Figure 1-A provides a graphical summary of the joint estimated relationship.

medRxiv preprint doi: https://doi.org/10.1101/2020.05.05.20092627; this version posted May 8, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

Air pressure (P) has a weak U-shaped effect on the reproduction number, with a minimum around 1,004 millibars. One additional millibar of pressure increases the estimated reproduction number by approximately 0.8% (95% CI, 0.6% to 1%) at the median pressure (1016 millibars) in our sample. We also find weak but significant positive effects of diurnal temperature, precipitation, snowfall and wind speed on reproduction number. With a one standard deviation increase in diurnal temperature (ΔT), log-transformed precipitation (C), log-transformed snowfall (S) and log-transformed wind speed (W), the estimated reproduction number increases by 1.9% (95% CI, 0.6% to 3.1%), 2.9% (95% CI, 1.6% to 4.2%), 1.4% (95% CI, 0.3% to 2.5%), and 3.8% (95% CI, 2.5% to 5.1%), respectively.
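To make the U-shape concrete, the following back-of-envelope sketch recovers the quadratic implied by the two quoted facts (a minimum near 1,004 millibars and a 0.8% increase per millibar at 1016 millibars). The resulting coefficients are implications of those two rounded numbers under a pure quadratic in (P - 1000); they are not the Table 1 estimates, and all names are illustrative.

```python
import math

# Values quoted in the text; the coefficients derived from them are
# back-of-envelope implications, NOT the Table 1 estimates.
P_MIN_MB = 1004.0                  # pressure at which the effect is smallest
P_MEDIAN_MB = 1016.0               # sample median pressure
SLOPE_AT_MEDIAN = math.log(1.008)  # +0.8% in R per millibar, on the log scale

# Model: ln(R) contribution = b1*x + b2*x**2 with x = P - 1000.
# Its derivative b1 + 2*b2*x is zero at x_min and equals the slope at x_med.
x_min = P_MIN_MB - 1000.0
x_med = P_MEDIAN_MB - 1000.0
b2 = SLOPE_AT_MEDIAN / (2.0 * (x_med - x_min))
b1 = -2.0 * b2 * x_min


def pressure_multiplier(pressure_mb):
    """Multiplier on R relative to the minimum of the U-shaped curve."""
    x = pressure_mb - 1000.0
    at_x = b1 * x + b2 * x * x
    at_min = b1 * x_min + b2 * x_min * x_min
    return math.exp(at_x - at_min)
```

Under these assumptions, moving from 1,004 to 1,016 millibars corresponds to a multiplier of about 1.05, which is consistent with the description of the pressure effect as weak.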

Overall, the association of various weather variables with COVID-19 transmission is moderate and potentially relevant to assessing the risk of contagion in different locations across the globe. Figure 1-B provides a histogram of the variations in relative R̂ associated with the combined set of weather variables in our estimation sample. The ratio between any two values can be interpreted as the ratio in R̂ due to differences in weather, all else equal. The gap between the 5th and 95th percentiles of this distribution indicates a change in relative R̂ of 43%. Given that the typical reproduction number estimated for COVID-19 is in the range of 2 to 3 [41, 42], estimated weather effects alone may not provide a path to containing the epidemic in most locations, but could notably impact the relative transmission rates.

Validation of our statistical method using synthetic data showed that our method is capable of identifying the correct sign and shape for the impact of weather variables and that those estimates are potentially conservative (i.e., downplaying the true impacts). We also conducted six different tests to assess the robustness of our findings. First, our results do not change with the use of different illness durations to calculate R̂ (Table S1 in section S4.1). Second, our main findings are robust to excluding extreme values of the dependent variable, excluding the last few days of data, or including location fixed effects in the analysis (Table S2 in section S4.2). Third, our results are largely insensitive to different exclusion criteria for initial periods (Tables S3 and S4 in section S4.3). Fourth, a placebo test where weather variables in each location are permuted and shifted by a random number of days shows no effect of the weather variables in most of our model specifications, adding confidence that results cannot be attributed to mechanical features of the statistical model (Table S5 in section S4.4). Finally, we found a significant and qualitatively relevant negative effect of moon illumination on reproduction number (which also did not change other weather effects; see Table S7 in section S4.6). Absent theoretical explanations, we decided not to include that effect in the main specification, but we find it worth further exploration.

The associations between weather variables and transmission rates highlight the potential to project the risk of COVID-19 spread as a function of weather conditions. Of course, our results are associative and extrapolating out-of-sample includes unknown risks. With that caveat in mind, one could calculate the weather components' contribution for any vector of weather variables based on Table 1. Exponentiating that value (as we do in Figure 1-B) leads to a multiplicative weather score which is associated with reproduction rate independent of the location-specific characteristics and responses. Such scores are therefore comparable across locations, and the ratio between two scores offers a measure of relative transmission risk for two vectors of weather variables. We define "Relative COVID-19 Risk Due to Weather" (CRW) as the relative predicted risk of each weather vector against the 90th percentile of predicted risk in our estimation sample, 1.33 (Figure 1-B). The choice of this reference point is somewhat arbitrary and is made to make a value of 1 a rather high risk of transmission due to weather. A CRW of 0.5 reflects a 50% reduction in the estimated reproduction number compared to the reference weather condition. Formally, CRW is the exponentiated weather contribution from Table 1 divided by the reference score of 1.33.
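One way to sketch the CRW definition in code is shown below. The function and feature names are hypothetical, and the demonstration uses a single coefficient implied by the quoted 3.1% reduction per degree above 25 degrees rather than the full set of Table 1 coefficients, which is not reproduced here.

```python
import math

REFERENCE_SCORE = 1.33  # 90th percentile of predicted weather scores (from the text)


def weather_score(features, coefficients):
    """Exponentiated linear weather contribution (a multiplicative score)."""
    linear = sum(coefficients[name] * value for name, value in features.items())
    return math.exp(linear)


def crw(features, coefficients):
    """Relative COVID-19 Risk Due to Weather against the reference score."""
    return weather_score(features, coefficients) / REFERENCE_SCORE


# Illustrative only: one coefficient implied by the quoted 3.1% reduction per
# degree above 25 C (humidity interaction ignored); not the Table 1 values.
demo_coefficients = {"temp_above_knot": math.log(1.0 - 0.031)}

# Ratio of CRW at 30 C versus 25 C, holding everything else equal.
ratio = (crw({"temp_above_knot": 5.0}, demo_coefficients)
         / crw({"temp_above_knot": 0.0}, demo_coefficients))
```

Because the reference score cancels in the ratio of two CRW values, comparisons across locations or dates depend only on the difference in weather contributions, which is what makes the scores comparable.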

It is important to note that these scores do not reveal the actual values of reproduction number; that value is highly contingent on location-specific factors and policies, for which we have no data outside of the estimation sample. For example, the COVID-19 reproduction number in New York City is likely larger than that in a rural district in upstate New York. Our projections cannot inform the absolute risks for either location. CRW scores could be compared to inform relative risks due to weather (i.e., assuming all else equal) across locations or within a location over time.

These projections use weather data from 2019-2020, averaged over a 15-day moving window, for 2020-2021 dates; as such, they include historical noise despite averaging. Many of these large cities go through periods of higher risk as well as reduced risk during the year. As discussed before, we cannot associate these risks with absolute reproduction numbers, and our estimates are likely conservative. Nevertheless, assuming typical reproduction rates in the 2-3 range, one needs CRWs below 0.3 to contain the epidemic based on weather factors, a condition rarely observed in our data. The website projects.iq.harvard.edu/covid19 provides these projections for the 1,072 largest global cities.


Combining one of the most comprehensive datasets of COVID-19 transmission to date with weather data across the world, this paper provides evidence for the association of various meteorological variables with the spread of COVID-19.

While we find a stronger effect (a 3.1% (95% CI, 1.5% to 4.8%) reduction in R̂ for every degree increase) for temperatures above 25 degrees Celsius, the relatively mild slope of the temperature effect below 25 degrees suggests many temperate zones with large population density may face larger risks, while some warmer areas of the world may experience slower transmission rates. For example, the estimated associations may partially explain the smaller sizes of outbreaks in southern Asia and Africa to date.

We also show that the challenge of estimation due to detection delay is significant and at best conservative estimates may be expected from standard regression methods. This observation may partially explain inconclusive and inconsistent prior results. Overcoming this challenge not only requires careful estimation of true infections based on reported cases, but also may benefit from methods that correct for the resulting bias, e.g., indirect inference [43] . Other limitations of the study include: the variance in reliability and availability of transmission data across the globe; oversampling from U.S. locations; use of last year's weather data to project next year's outcomes; and use of correlational evidence to inform out-of-sample projections.

Despite these limitations, consistent results using various conservative specifications and validation tests are promising indications of true impacts of weather conditions. These effects may offer partial relief to some regions of the world during the summer, but it is important for policymakers and citizens to remain vigilant in their responses to the pandemic, rather than assuming that summer climate naturally prevents transmission. In fact, much of the variation in reproduction number in our sample is explained by location-specific responses, not weather. Ultimately, weather much more likely plays a secondary role in the control of the pandemic.

The authors declare no competing interests.

Ficetola, G.F. and D.

Gourieroux, C., A. Monfort, and E. Renault, Indirect inference. Journal of Applied Econometrics, 1993. 8(S1): p. S85-S118.


All code and data for this research are available at https://github.com/marichig/weather-conditions-COVID19.

Case and coordinate data were first taken from JHU's published case reports, available at https://github.com/CSSEGISandData/COVID-19, which covered all locations in the United States and 258 of the 590 locations from outside the U.S., including breakdowns for Canada into 10 states/territories, and Australia into 8 states. The remaining locations were made up of 301 Chinese cities and 31 Iranian states, and for these case and coordinate data were taken from the Chinese Center for Disease Control and Prevention, Provincial Health Commissions in China, and Iran's state level reports.

Some locations included in the U.S. case reporting data were dropped from the main analysis. Namely:

Errors in the reported coordinate data were also identified and resolved manually. (For instance, Congo-Brazzaville was reported to have the same coordinates as Congo-Kinshasa.) With these coordinate data, weather data were collected primarily through World Weather Online (WWO), which provides an API for data collection; the Python "wwo-hist" package <https://pypi.org/project/wwo-hist/> was used to access this API. Historical weather data were collected for each day from 1/23/2019 through 4/22/2020, with data from 2019 being used for future projection.

The following variables were collected: maximum daily temperature, Celsius; minimum daily temperature, Celsius; average daily temperature, Celsius; precipitation, millimeters; humidity, percentage; pressure (atmospheric), millibars; windspeed, kilometers per hour; sun hours (i.e., hours of sunshine received); total snowfall, centimeters; cloud cover, percentage; ultraviolet (UV) index; moon illumination, percent (i.e., percentage of moon face lit by the sun); local sunrise and sunset time; local moonrise and moonset time; dew point, Celsius; "Feels Like", Celsius; wind chill, Celsius; wind gust (i.e., peak instantaneous speed), kilometers per hour; visibility, kilometers; and wind direction degree, clockwise degrees from due north. A description of each variable is available at https://www.worldweatheronline.com/developer/api/docs/historical-weather-api.aspx. The ultraviolet (UV) index data was not consistently reported from WWO, and was instead gathered using OpenWeatherMap <https://openweathermap.org/> and the Python "pyowm" package <https://pypi.org/project/pyowm/>.

We interpolated over any missing entries in the temperature and UV data provided. The reported temperature data were missing for most (but not all) locations on 9/15-17/2019, 10/22/2019, 11/27/2019, and 12/15/2019, and were interpolated using five-day moving averages. UV data were missing for less than 0.1% of date-location pairs, with the main gaps occurring on 6/2/2019, 8/13/2019, 12/2/2019, 2/18/2020, and 2/21/2020, and were interpolated using three-day moving averages. This averaging should not impact the analysis given that most of the above dates fall outside the pandemic's date range.
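A minimal sketch of this kind of gap filling is given below. It assumes a window centered on the missing day and averages whatever observed values fall inside it; the centering and the function name are assumptions for illustration, not a description of the released code.

```python
def fill_with_moving_average(values, window=5):
    """Fill None entries with the mean of observed values in a centered window.

    values: daily series for one location, with None marking missing days.
    window: odd window length in days (5 for temperature, 3 for UV in the text).
    """
    half = window // 2
    filled = list(values)
    for i, value in enumerate(values):
        if value is None:
            # Only originally observed neighbours are used, so filled values
            # never feed into other filled values.
            neighbours = [x for x in values[max(0, i - half): i + half + 1]
                          if x is not None]
            if neighbours:
                filled[i] = sum(neighbours) / len(neighbours)
    return filled
```

For a three-day gap such as 9/15-17/2019, a five-day centered window still reaches at least one observed day on either side of the gap, so every missing day receives a value.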

Population density data were sourced from Demographia (Cox, W., Demographia World Urban Areas, 15th Edition, The Public Purpose), which provided data for urban areas with population greater than 500,000; the United States Census (U.S. Census Bureau, data.census.gov/cedsci); the Iran Statistical Centre; the United Nations' projections; City Population (citypopulation.de); and official published estimates for countries not covered by these sources. For data sourced from Demographia, the population densities reported are urban densities, whereas other sources primarily reported overall density (spanning urban and non-urban areas). The urban and overall densities are largely on different orders of magnitude, which weakens the inclusion of population density as an independent variable.

Reported data on daily detected infections for COVID-19 do not reflect the true infection rate on a given day; rather, they lag the true infections due to both the incubation period (during which patients are asymptomatic and less likely to be tested) and the delay between onset of symptoms, testing, and incorporation of test results into official data. We need estimates of the true infection rates for each day to calculate the daily reproduction number (i.e., R̂(t)). Therefore, identifying the lag structure between measured infections (IM) and true infections (IN), which we call "Detection Delay", is key to backtracking from measured infections to estimates of the true infection rate.

Prior research has provided several estimates for subsets of the overall Detection Delay. The incubation period, the time between infection and onset of symptoms, has been estimated by several teams. Li and colleagues [1], using data from 10 early patients in China, find the mean incubation period to be 5.2 days, and the delay from onset to first medical visit to be 5.8 days for those infected before January 1st and 4.6 days for the later cases. Lauer and colleagues [2] use data from 181 cases to estimate the incubation period with a mean of 5.5 and a median of 5.1 days, and offer fitted distributions using Lognormal, Gamma, Weibull, and Erlang specifications. In a supplementary graph, they also provide a figure that includes the lags from the onset of symptoms to official case detection. Guan et al. [3] use data from 291 patients and estimate a median incubation period of four days with an interquartile range of 2 to 7 days. Linton and colleagues [4] use data from 158 cases to estimate the incubation period with a mean (standard deviation) of 5.6 (2.8) days. This delay goes down to 5 (3) days when excluding Wuhan patients. They also report an onset-to-hospital-admission delay of 3.9 (3) days for living patients (155 cases). They provide their full data in an online appendix, from which we calculated the onset-to-case-report lag with a mean of 5.6 days, a median of 5 days, and a standard deviation of 3.8 days. A New York Times article [5] reports that the Centers for Disease Control estimates the lag between onset of symptoms and case detection to be four days. Finally, a Bayesian estimation of the detection delay using abrupt changes in national and state policies by Wibbens and colleagues finds the mode of the delay to be 11 days, with a range between 5 and 20 days [6].

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted May 8, 2020.

Overall, these findings are consistent and point to an incubation period of about 5 days and an onset-to-detection lag of about the same length. We use the Lauer et al. estimates for a Lognormal incubation period with parameters 1.62 and 0.418 (leading to a mean (standard deviation) of 5.51 (2.4) days), and another Lognormal distribution with parameters 1.47 and 0.52 (resulting in 5 (2.8) days) for the onset-to-detection delay. Combining these two distributions using 10 million Monte Carlo simulations, we generate the Detection Delay lag structure used in the analysis (Figure S4). The code calculating this distribution is found at <https://github.com/marichig/weather-conditions-COVID19/>.
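The combination of the two Lognormal delays can be sketched as follows. This is a minimal illustration rather than the code in the repository: the function name `detection_delay_pmf`, the smaller sample size, and the rounding of the total delay to the nearest whole day are all assumptions made here.

```python
import random
from collections import Counter

def detection_delay_pmf(n_samples=200_000, seed=0):
    """Monte Carlo estimate of the total delay (in whole days) from infection to detection."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        incubation = rng.lognormvariate(1.62, 0.418)        # infection to symptom onset
        onset_to_detection = rng.lognormvariate(1.47, 0.52)  # symptom onset to case report
        # Assumption: the total delay is rounded to the nearest whole day.
        counts[round(incubation + onset_to_detection)] += 1
    return {day: counts[day] / n_samples for day in sorted(counts)}

pmf = detection_delay_pmf()
mean_delay = sum(day * p for day, p in pmf.items())
print(f"mean detection delay: {mean_delay:.2f} days")
```

The mean of the simulated total delay should land close to the sum of the two component means reported above (about 5.5 plus about 5 days).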

Here we develop an algorithm that provides a more accurate estimation of true exposure than a fixed shift in reported data or averaging data over a time period. We later compare our algorithm's performance with the simpler, more common, methods. We find that accurate estimation of effects of weather variables hinges directly on accurately estimating true infections, making the algorithm in this section key to overall estimation.

Using the delay structure specified in the previous section, one can estimate true infection rates using various methods. The most common solution is to just shift the official infections based on the average, median, or mode of the Detection Delay (9 to 11 days). This approximation may be fine in steady state, but becomes more inaccurate when estimating time series with exponential growth: the detected infections today are more likely to be from (the many more) recent infections than (the fewer) 10 days ago.
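The bias of a fixed shift under exponential growth can be illustrated numerically. The sketch below uses a hypothetical bell-shaped delay distribution peaked at 10 days (not the estimated Detection Delay) and an assumed daily growth rate. Among cases detected today, the share infected `s` days ago is proportional to `p(s) * exp(-r * s)` when true infections grow as `exp(r * t)`.

```python
import math

# Hypothetical delay pmf over days 5..17, peaked near 10 days (illustration only).
raw = {s: math.exp(-0.5 * ((s - 10) / 3) ** 2) for s in range(5, 18)}
total = sum(raw.values())
delay_pmf = {s: w / total for s, w in raw.items()}

def backward_delay_pmf(delay_pmf, growth_rate):
    """Distribution of days since infection among cases detected today,
    when true infections grow as exp(growth_rate * t)."""
    weights = {s: p * math.exp(-growth_rate * s) for s, p in delay_pmf.items()}
    norm = sum(weights.values())
    return {s: w / norm for s, w in weights.items()}

def mean_days(pmf):
    return sum(s * p for s, p in pmf.items())

steady = mean_days(backward_delay_pmf(delay_pmf, 0.0))   # steady state: no growth
growing = mean_days(backward_delay_pmf(delay_pmf, 0.2))  # assumed 20% daily growth
print(f"steady state: {steady:.2f} days, exponential growth: {growing:.2f} days")
```

With growth, detected cases are on average more recent than the forward delay suggests, so shifting by the mean delay places infections too early.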

The main objective of our algorithm is to find better estimates for the true infections. We first calculate the expected number of daily detected cases, given a series of actual infections that is unknown in the real world. Call the actual infections on day $t$, $I_N(t)$, and the detected infections on day $t$, $I_M(t)$. The following equation relates the two constructs:

$$E\big(I_M(t)\big) = \sum_{s=1}^{L} p(s)\, I_N(t-s)$$

where $E(\cdot)$ takes the expectation on detected infections, $p(\cdot)$ is the probability distribution for the detection delay estimated in section 2 of the appendix, and the index $s$ ranges from 1 to $L = 17$ days to account for different delay lengths. This equation does not account for test coverage, but as discussed below test coverage cancels out of the final reproduction number calculations, and as such only impacts the variability of outcomes and otherwise has limited impact on results. Note that this equation is under-specified: for one value of the known measure $I_M$, one has to find up to $L$ values of the unknown $I_N$ (in our case, no detection is expected in the first 4 days, so $L-4=13$ values of $I_N$ contribute to a value of $I_M$ (Figure S4)). However, given the overlap in the $I_N$ values determining subsequent $I_M$ values, the system of equations connecting $I_M$ and $I_N$ values for a time series extending over $T$ days would include $T$ known values (for $I_M$) and $T+L$ unknown values. Different approaches could then be pursued to find approximate solutions for this system of equations.
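The relation between true and expected detected infections is a discrete convolution with the delay distribution. A minimal sketch, using a hypothetical function name and a toy two-day delay distribution chosen only for illustration:

```python
def expected_detected(true_infections, delay_pmf):
    """Expected detected cases per day: sum over s of p(s) * true_infections[t - s].

    delay_pmf maps a delay in days (s >= 1) to its probability p(s).
    Days before the start of the series are treated as having no infections.
    """
    n_days = len(true_infections)
    expected = [0.0] * n_days
    for t in range(n_days):
        for s, p in delay_pmf.items():
            if t - s >= 0:
                expected[t] += p * true_infections[t - s]
    return expected

# Toy example: 10 infections on day 0, detected after 1 or 2 days with equal probability.
toy_expected = expected_detected([10, 0, 0, 0], {1: 0.5, 2: 0.5})
print(toy_expected)  # [0.0, 5.0, 5.0, 0.0]
```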

Using exact Maximum Likelihood suffers from the intractability of specifying the likelihood for highly correlated Poisson distributions (Poisson is a natural alternative in this case). We compared two alternatives, one using Normally distributed approximations for measured infections $I_M$ as a function of true infections $I_N$, and another using a direct minimization of the gap between $I_M$ values and their expectation. The latter proves both conceptually simpler and more accurate in synthetic data, so we picked that for the main analysis:

$$\min_{I_N} \sum_{t} \Big(I_M(t) - E\big(I_M(t)\big)\Big)^2$$

Given the under-specification of the original system of equations, this optimization admits many solutions. To identify a more realistic solution from that set, we add a regularization term that penalizes the gap between subsequent values of $I_N$. Specifically, we use the following optimization:

$$\min_{I_N} \sum_{t} \Big(I_M(t) - E\big(I_M(t)\big)\Big)^2 + \lambda \sum_{t} \big(I_N(t) - I_N(t-1)\big)^2$$

The solution to this optimization can be found using standard quadratic programming methods, allowing for fast and scalable solutions. We conducted a sensitivity analysis to find the regularization parameter, λ, offering the best overall ability of the algorithm to find true infections in synthetic data. The algorithm works well with λ values in the 0.1 to 0.5 range and is not very sensitive to the exact value; we used a value of 0.2 in our analysis. An implementation of this code in Matlab is available from <https://github.com/marichig/weather-conditions-COVID19/>.
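A smoothness-penalized least-squares fit of this kind can be sketched in Python as below. This is not the Matlab implementation from the repository. It assumes a squared gap between measured infections and their expectation plus λ times the squared day-to-day differences in true infections, solves the resulting linear system directly, and clips negative values afterwards instead of imposing constraints inside a quadratic-programming solver. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def estimate_true_infections(detected, delay_pmf, lam=0.2):
    """Regularized least-squares estimate of daily true infections.

    detected  : length-T sequence of measured infections
    delay_pmf : length-L sequence with delay_pmf[s - 1] = p(s) for s = 1..L
    lam       : weight on squared day-to-day changes in true infections
    Returns T + L values for days -L .. T-1; the final day is informed
    only by the smoothness penalty.
    """
    y = np.asarray(detected, dtype=float)
    p = np.asarray(delay_pmf, dtype=float)
    T, L = y.size, p.size
    n = T + L
    # Row t of A maps true infections on days t-L .. t-1 to expected detections on day t.
    A = np.zeros((T, n))
    for t in range(T):
        for s in range(1, L + 1):
            A[t, t + L - s] = p[s - 1]
    # D takes first differences between consecutive days.
    D = np.diff(np.eye(n), axis=0)
    x = np.linalg.solve(A.T @ A + lam * D.T @ D, A.T @ y)
    # Simplification in this sketch: negative estimates are clipped to zero.
    return np.clip(x, 0.0, None)

# Toy check: a constant series of detections implies constant true infections.
toy_pmf = [0.0, 0.2, 0.5, 0.3]
toy_estimate = estimate_true_infections([7.0] * 20, toy_pmf)
print(toy_estimate.round(3))
```

Because the penalty only ties neighbouring days together, larger values of `lam` give smoother estimated infection curves at the cost of a looser fit to the detected series.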

For each location in our dataset, we used this algorithm to estimate the true infections, $\hat{I}_N(t)$, on a daily basis, starting from 17 days before the first detected infection and stopping 5 days before the last day with data (because only infections from 5 days or further back could be found in current measures of infection; see the Detection Delay distribution (Figure S4)). These values were then used to create the dependent variable, $\hat{R}(t)$, as discussed in the body of the article.


We recognize that not all infections are reported, and a large fraction may remain unknown. Assuming that only a fraction $f$ ($0 \le f \le 1$) of actual infections is reported, $I_M$ would be the fraction $f$ of total infections that could have been detected on a given day, and the estimated $\hat{I}_N(t)$ will be the fraction $f$ of true infections as a result. While this under-estimation would be very significant if we cared about the absolute values of $\hat{I}_N(t)$, note that $\hat{I}_N(t)$ values show up in both the numerator and the denominator of the $\hat{R}(t)$ equation. Therefore, multiplying both by a fixed constant makes no difference to the estimated $\hat{R}(t)$.
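The cancellation of a constant reporting fraction can be checked numerically. The exact formula for the reproduction number is given in the body of the article and is not reproduced here; the sketch below assumes, purely for illustration, a ratio of the form `tau * I(t) / sum(I(t - tau) .. I(t - 1))`, which has estimated infections in both numerator and denominator. The function name and the synthetic series are hypothetical.

```python
def reproduction_number(infections, t, tau=20):
    """Assumed ratio form (illustration only): new infections on day t, scaled by tau,
    divided by the sum of new infections over the previous tau days."""
    window = infections[t - tau:t]
    return tau * infections[t] / sum(window)

true_series = [1.1 ** day for day in range(40)]        # hypothetical true infections
reported_series = [0.1 * x for x in true_series]        # only a fraction f = 0.1 is reported

r_true = reproduction_number(true_series, 30)
r_reported = reproduction_number(reported_series, 30)
print(r_true, r_reported)
```

Any estimator that is a ratio of infection counts scaled by the same constant behaves the same way; a reporting fraction that changes over time does not cancel, which is why early data points are dropped.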

We also recognize that, early in the epidemic, $f$ may increase with expanding test capacity until reaching a steady-state value. Therefore, as discussed later, we drop the first few data points for each region and check the sensitivity of the results to dropping fewer or more days. Finally, our synthetic analysis (section 5.2.2.3, Experiment 10) shows that results are robust to various trajectories for $f$ over the course of the epidemic.

We conducted six different sets of sensitivity tests to assess the robustness of our findings to various assumptions and boundary conditions [footnote 1]. Here is a summary of the results, before we go into the details:

(1) In our main specification we used a delay of τ=20 days from exposure to resolution to calculate the reproduction number R0. Here we tested the robustness of our results over a spectrum of reasonable delay durations from 15 to 25 days, finding no major impact on the results.

(2) We tested outcomes under three additional exclusion criteria and specifications: exclusion of the last few days of data (for which true infection estimates may be less reliable), exclusion of the top 1% of R0 in our sample (which may be generated by reporting issues), and inclusion of location fixed effects. Results are qualitatively robust under all these specifications.

(3) In our main specification we dropped the first 20 days since new infections exceed 1 for each location to account for early changes in test coverage. Here we tested the robustness of our results to other exclusion periods, ranging from the first 10 days to the first 30 days, finding no major impact on the outcomes.

(4) To exclude the possibility that our results are driven by mechanical features of our variable construction and model specification, we used a set of placebo weather variables, which are randomly shifted across locations and over a specific number of days, and re-estimated our main models using these placebo weather variables. We found few significant effects under these placebo tests.

(5) In our main specification we chose a linear spline effect of mean temperature with a knot at 25 degrees. Here we tested how sensitive our results are to different choices of knots, finding that the 25 degree knot provides the best balance.

(6) Finally, we reported an analysis that adds moon illumination as an additional independent variable to our main specification. We found a consistent and significant negative effect of moon illumination on the reproduction number. However, lacking any theoretical justification for this effect, we did not include this factor in our preferred model specification.

Footnote 1: As in our main specification, all models tested here used log(R0) as the outcome and included location-specific linear trends. All models excluded days with new infections <1 and the first 20 days since new infections exceed 1 for the first time in each location, unless specified otherwise.


Table S2 presents the results using our main specification with R0 calculated from 15, 20, and 25 days of delay, respectively. The coefficients and significance levels for each weather variable are largely unchanged and consistent across the different durations, especially for air pressure, mean temperature, and the interaction effect between humidity and mean temperature.

Table S3 presents the results when we exclude the top 1% of R0, exclude the last 4 days of our data (19% of the total sample in our main specification), or include location fixed effects. The effects of wind speed, precipitation, air pressure, mean temperature, and the interaction between humidity and mean temperature are all robust to both exclusion criteria, while the positive effects of snowfall and diurnal temperature are no longer significant when we exclude the last 4 days of data. With the inclusion of location fixed effects, all of the weather effects become weaker, confirming our intuition (from the synthetic experiments) that weather effects might be underestimated with the inclusion of location fixed effects. The main reason is that within-region variation of temperature is smaller than cross-regional variation, and with fixed-effect regressions one loses the opportunity to fully utilize the temperature variation in the data. However, even with fixed effects, the effects of wind speed, precipitation, mean temperature, and the interaction between humidity and temperature are still consistent and significant.

Table S4 and Table S5 present the results when we exclude the first 10, 15, 20, 25, and 30 days since new infections exceed 1 for the first time in each location. Overall, our main results are consistent and insensitive to the different exclusion criteria. The exception is when we exclude the first 30 days of data for each location: there we lose more than half of our sample compared with the main specification (i.e., excluding the first 20 days), and the effects of wind speed, precipitation, and temperature become weaker and are no longer significant. It is possible that by constraining our estimation to later periods, we are focusing on periods when lockdown and social distancing are fully in effect, and thus there is little variation left in R0 that can be explained by weather. However, the coefficients for these aforementioned effects are still consistent and have the same sign. For example, from column 5 we estimated that with a one-degree increase in mean temperature above 25 degrees, the estimated R0 will still decrease by ~2%.

We first randomly permuted weather variables across locations in our data, and then shifted all weather variables in each location to earlier periods by a specific number of days, where the number is randomly drawn from a uniform distribution U(0,300). We then performed the statistical analysis using these "placebo" weather variables. As shown in Table S6, most of the weather effects disappear, especially in our main specification where the first 20 days are excluded. The only exception arises when the first 30 days are excluded: there we observe positive and significant effects of humidity and temperature above 25 degrees, which have opposite signs from the results in our main conclusion. These results are likely to be driven purely by chance, but we nevertheless conclude that caution should be exercised when interpreting results from specifications that drop a large number of initial periods.
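The construction of placebo weather series can be sketched as follows. The data layout (a dictionary from location to a daily series with enough history before the study window), the function name, and the use of whole-day shifts with inclusive bounds are assumptions made for this illustration.

```python
import random

def make_placebo_weather(weather, n_days, max_shift=300, seed=0):
    """Assign each location another location's weather, taken from an earlier period.

    weather[loc] is a daily series whose last n_days entries form the study window;
    each series needs at least n_days + max_shift entries.
    """
    rng = random.Random(seed)
    locations = list(weather)
    donors = locations[:]
    rng.shuffle(donors)                      # random permutation across locations
    placebo = {}
    for loc, donor in zip(locations, donors):
        shift = rng.randint(0, max_shift)    # assumption: whole days, bounds inclusive
        series = weather[donor]
        end = len(series) - shift
        placebo[loc] = series[end - n_days:end]
    return placebo

# Toy data: 5 locations with 400 days each; values encode (location, day) for inspection.
toy_weather = {f"loc{i}": [i * 1000 + day for day in range(400)] for i in range(5)}
toy_placebo = make_placebo_weather(toy_weather, n_days=50)
```

Each placebo series keeps the autocorrelation of real weather but has no causal link to the epidemic in the location it is assigned to.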


In the main specification we used a linear spline effect of mean temperature with a knot at 25 degrees, as it provides a better fit than a linear or quadratic effect of temperature. Here we test the sensitivity of our results to the choice of knot over a wide range of mean temperatures, from -15 degrees to 30 degrees. As shown in Table S7, the temperature effect after the knot is statistically significant and much larger at 25 degrees (-0.0318, p<0.001) than at knots at other temperatures. Hence, we chose the knot at 25 degrees for our main specification.
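A linear spline with a single knot can be coded as two regressors. The sketch below assumes one common parameterization, in which the first term carries the slope below the knot and the second the slope above it; the function name is hypothetical, and the parameterization used in the actual estimation may differ.

```python
import math

def linear_spline_terms(temp_c, knot=25.0):
    """Split a temperature into the part up to the knot and the part above it."""
    below = min(temp_c, knot)
    above = max(temp_c - knot, 0.0)
    return below, above

print(linear_spline_terms(30.0))  # (25.0, 5.0)
print(linear_spline_terms(10.0))  # (10.0, 0.0)

# With log(R0) as the outcome, a coefficient of -0.0318 on the above-knot term
# corresponds to this proportional reduction per additional degree above 25:
reduction_per_degree = 1.0 - math.exp(-0.0318)
print(f"{reduction_per_degree:.1%}")
```

Moving the `knot` argument over a grid of candidate temperatures and refitting is one way to carry out the sensitivity check described above.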


Finally, as an exploratory analysis we included moon illumination (percentage; range: 0-100) as an additional independent variable in our model. While all of our previous conclusions regarding other weather variables remain unchanged, we observed a significant and non-negligible negative effect of moon illumination: a 1 percent increase in moon illumination is associated with a 0.25% decrease in the estimated reproduction number. This effect is robust to looking at different subsets of locations with different epidemic start dates, so it cannot be explained by mechanical artifacts of the timing of the majority of locations. Overall, we lack a causal theory for why such an effect may exist. Moreover, as moon illumination is constant across locations on each day and exhibits cyclic behavior over each lunar month, it is possible that moon illumination is confounded with other cyclic behavioral patterns occurring globally in our data, and thus we decided not to include it in the main analysis. Nevertheless, we report it here to show that other weather effects are robust to this additional inclusion and to point to possible future avenues of investigation.


We conducted extensive synthetic analyses to inform the selection of a reliable statistical method and to build confidence in our final estimation method. These analyses can be divided into those focused on specifying the estimation method (5.2) and those designed to validate our method in a study where the analyst was blinded to the true specification (5.3). After providing an overview in 5.1, we explain these two sets of analyses. Detailed code is available at <https://github.com/marichig/weather-conditions-COVID19/>.

Before going into the details of the analysis, we review the main objectives and approach of this test and the main findings. Then in sections 5.2 and 5.3 we provide more details about the analysis.

• Approach: To build and validate our method and examine its sensitivity to different assumptions, we created synthetic data from simulated epidemics with several different assumed temperature effects on infection. The true exposure, the exact detection delays, and the temperature functions were hidden from our estimation method to assess the method's success objectively. Our objectives were twofold, and we allocated tasks among researchers to meet them. First, researcher HR used iterations of this process to improve our statistical estimation method and ensure that it was able to find temperature functions under various assumptions (section 5.2.2). Second, we used a more realistic individual-level model of infection (a stochastic agent-based model), built by two other investigators (NG and MG) not involved in the first synthetic data analysis (used for method design), to assess whether our statistician (RX), who was unaware of the true functional forms or the new model structure, could identify the correct effects in this different simulation environment (section 5.3). This design addressed the risk that a method fine-tuned on synthetic data may perform well under the assumed simulation setting but fail in other environments.

• Results: The key findings from these experiments, which are elaborated in detail below, include:
1) The model specification we use can accurately identify correct weather impacts if true infections were observable.
2) In the absence of data on true exposure (which is the case in COVID-19 due to testing delays), however, estimation of weather impact becomes complicated, and many intuitive specifications, often used in other studies, fail to recover true impacts. This may be a serious challenge afflicting many attempts to identify the link between weather and COVID-19 transmission.
3) Our algorithm for uncovering the true exposure, and the specification we selected, offer a potentially conservative but qualitatively informative view of the true underlying impacts.
4) Our preferred specification is robust to a few key uncertainties that may vary between simulated numbers and the actual epidemic.
5) A statistician blinded to the true data generating process was able to use this method to identify true weather effects from synthetic data generated from a different, more detailed, agent-based model of COVID-19 epidemics.
Given these results from the analysis of the synthetic data, we can have more confidence in the analysis using the actual data.

Our approach consists of building a simulation model of an epidemic to generate synthetic data (with known weather impact functions), followed by estimating various statistical specifications to assess their ability to identify the true functional forms.


We used a simple SIR-based simulation model to generate synthetic epidemics. This model was applied across various locations (with different vectors of weather variables, $W(t)$, impacting the epidemic curve based on $g(W(t))$) to generate the raw data going into alternative statistical methods for identifying the function $g$. Equations of the simulation model are presented in Table S8.

Table S8. Equations of the SIR-based stochastic simulation model of epidemics. Iterations are completed on a daily basis until the epidemic ends or until t=50 is reached.

• New infections: assumed Poisson based on the susceptible stock, the infectious stock, the force of infection, and the weather effect ($g(W(t))$), primarily focusing on mean temperature.

• Future recoveries: adding to future daily recovery rates to incorporate the recovery of all those infected today using a multinomial distribution. The recovery delay distribution is assumed Poisson with a mean of 20 days. The "+=" operator adds the values on the right-hand side to the existing vector on the left-hand side.

• Future detected cases: adding to future measurements of detected cases ($I_M$) to incorporate the fraction $f$ of those infected today in future detection data. A multinomial distribution is used following the Detection Delay lag structure (see section 2 of the Appendix). In our baseline model we use $f = 0.1$, and we test more complex functional forms where $f$ increases over time from zero to a maximum of 0.3 in response to measured infections in 3 different scenarios.

• Infectious stock: updating the next-period stock of infectious individuals based on recovery and new infections. An initial infectious population of 3 is assumed so that few epidemics die out due to stochasticity.
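The daily steps described for the simulation model can be sketched in Python. Since the equation column of the table is not reproduced here, several details are assumptions of this sketch: the transmission rate `beta`, the population size, the expected-infections form `beta * exp(g(W(t))) * S * I / N`, the truncation of the recovery delay at 60 days, and the binomial draw for the detected fraction. All names are hypothetical.

```python
import math
import numpy as np

def simulate_epidemic(weather, g, delay_pmf, beta=0.15, population=1_000_000,
                      f=0.1, horizon=50, seed=0):
    """One synthetic epidemic; returns (true new infections, detected cases) per day."""
    rng = np.random.default_rng(seed)
    # Recovery delay: Poisson with mean 20 days, truncated at 60 days and renormalized.
    max_rec = 60
    rec_pmf = np.array([math.exp(-20.0) * 20.0 ** k / math.factorial(k)
                        for k in range(max_rec)])
    rec_pmf /= rec_pmf.sum()
    delay_pmf = np.asarray(delay_pmf, dtype=float)
    delay_pmf = delay_pmf / delay_pmf.sum()   # delay_pmf[s] = P(detection delay = s days)

    susceptible, infectious = population - 3, 3            # initial infectious population of 3
    recoveries = np.zeros(horizon + max_rec, dtype=np.int64)
    recoveries[:max_rec] += rng.multinomial(3, rec_pmf)    # schedule recovery of initial cases
    detected = np.zeros(horizon + delay_pmf.size, dtype=np.int64)
    new_infections = np.zeros(horizon, dtype=np.int64)

    for t in range(horizon):
        if infectious <= 0:                                 # epidemic has ended
            break
        # Assumption: weather scales transmission through exp(g(W(t))).
        rate = beta * math.exp(g(weather[t])) * susceptible * infectious / population
        n = min(int(rng.poisson(rate)), susceptible)
        new_infections[t] = n
        recoveries[t:t + max_rec] += rng.multinomial(n, rec_pmf)   # add to future recoveries
        reported = rng.binomial(n, f)                               # fraction f is ever detected
        detected[t:t + delay_pmf.size] += rng.multinomial(reported, delay_pmf)
        susceptible -= n
        infectious += n - int(recoveries[t])
    return new_infections, detected[:horizon]

# Toy run: constant 5 degrees, quadratic g, and a hypothetical delay distribution.
toy_g = lambda w: -0.01 * w - 0.002 * w ** 2
toy_delay = [0, 0, 0, 0, 0, 1, 2, 3, 4, 4, 3, 2, 1]
sim_infections, sim_detected = simulate_epidemic([5.0] * 50, toy_g, toy_delay, seed=1)
```

The detected series lags and undercounts the true series, which is the situation the estimation method has to cope with.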

We use different g functions, including quadratic and linear forms. We focus on temperature as the primary variable that is read from data and input into the synthetic data.

In our main synthetic analysis, we use:

ℎ( ) = 40 + ( (1: 50) ) 50

This creates a positive correlation of 0.47 between temperature that affects transmission rates and the reproduction number for a specific location.

We use mean daily temperature data starting from January 1st, 2020 for t=1 for all locations, and continue accordingly.

Using these specifications, we then conducted multiple experiments to identify a viable specification. In each experiment we simulated the model for 20 iterations (with different random realizations) for a sample of 1,000 locations randomly drawn from our 3,739 locations, with their actual temperature data feeding into $W(t)$.

We summarize the results from each experiment using a graph of the shape of the estimated relationship between temperature and the natural logarithm of the reproduction number, i.e., $g(W(t))$, and compare that with the true relationship (the thick dashed line in the figures below). The success measure for our method is to have the estimated relationship close to the thick dashed line. Effects falling between the true curve and a horizontal line would be conservative, and those falling outside this range may be misleading. Our actual temperature data in the simulation period are bounded to smaller ranges (90% of the data fall between -10 and 20 degrees Celsius) than reported in these figures. Therefore, extrapolations outside this range are not necessarily indicated by the data; rather, they emerge from the estimated functional forms. Nevertheless, we graph a much wider temperature range (-30 to 50) to highlight the risks of such extrapolation. We also report the means of the estimated parameters for $g(W(t))$, the 95% confidence interval, the coefficient of determination ($r^2$), and the sample size for each experiment.

In the rest of this section, unless specified in the top row of a figure, we focus on quadratic equations for $g(W(t))$. The main specification uses $g(W(t)) = -0.01\,W(t) - 0.002\,W(t)^2$. We also conduct sensitivity analysis to other functional forms for $g(W(t))$. We summarize the results of these synthetic analyses under 10 experiments, which can be categorized into three subsets. Experiments 1 to 3 (sections E1-E3 below) introduce the main challenges in correctly associating weather with the reproduction number, concluding that, while the task is not impossible, the best one can expect from similar efforts may be to find estimates that are conservative but not misleading. Next, in experiments E4 to E6 we show support for the chosen statistical specification against other plausible alternatives. Finally, experiments E7 to E10 show the robustness of the preferred specification to a variety of assumptions.
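The success criterion for the synthetic experiments can be expressed as a simple check. The sketch below assumes that the horizontal reference line is a null effect (zero on the log scale) and labels an estimate "conservative" when it lies between zero and the true curve at a given temperature, and "misleading" otherwise. The function names, the tolerance, and the example estimates are hypothetical.

```python
def g_quadratic(w, linear=-0.01, square=-0.002):
    """Quadratic weather effect on the log scale, with the main synthetic coefficients."""
    return linear * w + square * w ** 2

def classify_estimate(true_g, est_g, temps, tol=1e-9):
    """Label the estimated curve at each temperature relative to the true curve."""
    labels = []
    for w in temps:
        truth, est = true_g(w), est_g(w)
        low, high = min(0.0, truth), max(0.0, truth)
        labels.append("conservative" if low - tol <= est <= high + tol else "misleading")
    return labels

temps = range(-10, 21, 5)   # the range covering most of the simulated temperature data
attenuated = lambda w: g_quadratic(w, linear=-0.005, square=-0.001)  # half the true effect
wrong_sign = lambda w: -g_quadratic(w)

print(classify_estimate(g_quadratic, attenuated, temps))
print(classify_estimate(g_quadratic, wrong_sign, [20]))
```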

E1) Endowed with true, deterministic infections, the method finds the correct impact of weather. Relaxing either assumption deteriorates results.

In Figure S5, we compare three different scenarios. In the first two (A and B), the estimation method is provided with the actual true infections (rather than those estimated using the method discussed above). Moreover, the first experiment (A) also assumed a deterministic infection rate (that is, new infections in Table S8 are set to their expected value rather than drawn at random). Plot C shows results using our baseline specification: estimating true infections using quadratic programming, including location-specific trend lines but no fixed effects, dropping days with estimated exposure below 1 as well as the first 20 days after the estimated exposure first reaches 1, and excluding outlier estimated reproduction numbers (those above the 95th percentile). The two assumptions of deterministic infections and known true infections in the first two experiments are not realistic. Instead, they illustrate the challenges to unbiased estimation of the reproduction number due to stochasticity of infections (comparing plots A and B) and estimation of true infections from reported data (comparing plots B and C).

Inspection of these results reveals two major challenges to estimating reproduction number: i) Randomness in infection rate leads to weaker identified effects. ii) The imperfect identification of true infections from reported cases significantly reduces the magnitude of estimated effects. Both of these effects generate a bias towards null estimated effects, even when true effects are very significant. As the experiments reported in the following sections show, our baseline model, despite its conservative estimates, might be among the best available options to find estimates for the impact of weather on transmission rates.

Note that the true linear and square terms in $g(W(t))$ are reported in the title for each panel in the following figures.

Figure S5: Impact of stochasticity in infections and imperfect estimation of true infections on quality of estimated parameters.

Before focusing on the main specification, we report another set of experiments that show the performance of the estimation method with true infections under three other functional forms (Figure S6). These experiments differ from the first set (E1) in that they introduce location fixed effects. Including location fixed effects, as expected, would add value if true infections were available. This is in contrast with the situation when we use estimated infections, which will be reported under section 5.2.2.2 below.

The main observations in this set of experiments are: i) True infection rates, even including randomness in infection, would offer close estimates for the underlying impacts of temperature on the reproduction number; ii) The effects are potentially closer to true values because of the use of fixed effects (this can be seen by comparing panel C in these experiments with panel B in Figure S5); and iii) The effects remain slightly conservative, i.e., weaker than the true effects (shown in the title for each panel).

E3) Using the proposed estimation method provides conservative, but largely consistent, estimates.

In the next set of experiments, we test the main estimation method, which uses estimated exposures, to find the effect of temperature under the different functional forms simulated above. In these experiments, we continue to use only a location-specific trend, but no fixed effects, and the same inclusion criteria used in experiment 1. Overall, estimated effects, as shown in Figure S7 , are qualitatively consistent with the true functional forms, but also show important deviations: i) The results include some biases in estimated parameters, including quadratic terms that are not in the main function; ii) Results are somewhat conservative (pointing towards null effects) in the regions of the temperature actually covered by W data. Based on these observations, the use of the estimation method should include appropriate caution: the results may under-value the true magnitudes of temperature effects, but may also include unknown biases depending on the true shape of the function.

Figure S7: Performance of preferred estimation method with realistically available data in identifying different functional relationships between weather and reproduction number.

Our preferred specification uses the estimated infections based on the quadratic programming method discussed in section S3. Here we compare those results against a simpler specification that shifts detected infections back by 10 days to infer the true infection rate on each day. Results are shown in Figure S8. We assess this alternative under the same functional forms discussed in experiment 3 (E3), and thus the results are directly comparable with that experiment. In short, the simple shift method offers results that are largely worse than those of the preferred specification and may include problematic biases. These results motivated the search for alternative estimation methods to more precisely identify the true infections.

Figure S8: Impact of using a simple shift of measured infections on performance of estimation under various functional forms.

In this experiment, we compare three settings in which we exclude both fixed effects and trends ( Figure S9 , panel A), exclude trends but include fixed effects (panel B), and include both fixed effects and trends (panel C). These experiments are directly comparable with our preferred specification where trends are included but fixed effects are not (Panel C in Figure S5 ).

The overall performance is far from perfect in all cases. Moreover, whereas fixed effects typically help reduce bias in regression, the combination of both fixed effects and location-specific trends offers the most biased results (Panel C). In that setting, two different location-specific parameters absorb much of the variation in weather between and within locations, and combined with errors in the identification of true infections from reported data, the estimates for the weather function become unreliable. Removing either the fixed effects or the location-specific trends improves the correspondence between true functional forms and actual estimates, leading to conservative but not misleading results. This leaves us with a choice between three specifications that exclude at least one of the fixed effects and location-specific trends.

We chose the inclusion of trends as our main specification, based on two considerations. First, having collected population density data, we could use them to provide some control for potential correlations between temperature and basic reproduction number that arise from variations in population density with climate. Second, while our W vector in simulations starts from early January, in practice the more positive trend in temperature during the spread of the epidemic in late winter and early spring (when most of the actual epidemic data were generated) is likely to correlate more strongly with behavioral and other responses that dampen the spread in each location. We were therefore concerned that excluding location trends would lead the regression to pick up that spurious correlation and inflate the estimated impact of temperature, producing misleading results. We thus opted to focus on the more conservative estimates from our preferred specification. In practice, however, the inclusion or exclusion of fixed effects had limited impact on our empirical results (see Table S2 and section S4.2). This observation suggests that our synthetic data generation process may have offered a more challenging setup for estimation than the empirical data generating process.

Figure S9: Comparing the use of no fixed or trend effects (A), only fixed effects without trends (B), and both location-specific fixed and trend effects (C).

One potential issue in the current specification is that locations with larger outbreaks are weighted the same as locations with smaller outbreaks. The data from larger outbreaks may well be more reliable, and so may the estimates of R calculated from those data. We assess whether a correction for this issue can improve estimation results. To do so, we use simulations to estimate how the variance in the dependent variable scales with the number of estimated daily true infections, and use that estimated variance to conduct weighted least squares regressions. Results, reported in Figure S10, panel A (panel B shows the baseline replicated from experiment 1), show more bias and more dispersion than the unweighted regressions. Hence, we do not pursue this correction in our main specification.
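The weighting idea can be illustrated with a small weighted least squares fit in which each observation is weighted by the inverse of its assumed variance. This is a sketch under stated assumptions: the variance model in `variance_from_infections` (variance inversely proportional to the infection count) is hypothetical, whereas the study estimates the variance scaling from simulations; the single-predictor form and all names are illustrative only.

```python
def variance_from_infections(infections, scale=1.0):
    """Hypothetical variance model: Var(y) is proportional to 1 / infections."""
    return [scale / max(n, 1) for n in infections]


def weighted_least_squares(x, y, variance):
    """Fit y = a + b * x, weighting each point by 1 / variance."""
    w = [1.0 / v for v in variance]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


# Toy data lying exactly on y = 2 - 0.5 * x, so any weighting recovers the line.
x = [0.0, 1.0, 2.0, 3.0]
y = [2.0, 1.5, 1.0, 0.5]
infections = [5, 50, 500, 5000]
a, b = weighted_least_squares(x, y, variance_from_infections(infections))
```

With noisy data, the fit would lean toward observations from larger outbreaks; as reported above, this did not improve estimation in the simulations.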

In the baseline simulations, the location-specific population size, which partially controls the speed of spread and how quickly the slowdown in reported cases is realized, is normally distributed with a mean of 500,000 and a standard deviation of 200,000 (floored at a population of 1,000). In Figure S11, we compare that baseline (panel B) against standard deviations of 0 and 400,000. The impacts are small and largely negligible, suggesting that variation in the speed at which the spread slows down has little effect on the findings.
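The population draw described above can be sketched as follows. The mean, standard deviation, and floor come from the text; the function name, the seed, and the number of locations are assumptions made for illustration.

```python
import random


def draw_population(rng, mean=500_000, sd=200_000, floor=1_000):
    """Draw one location's population from a normal distribution with a floor."""
    return max(floor, rng.gauss(mean, sd))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
populations = [draw_population(rng) for _ in range(1000)]
average_population = sum(populations) / len(populations)
```

Setting `sd=0` or `sd=400_000` in the call corresponds to the two alternative scenarios compared against the baseline.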

In the baseline simulations (E1), the location-specific basic reproduction number had a mean of 3 and a standard deviation of 0.9 (normally distributed and floored at 0, but also positively correlated with average temperature in the location). In Figure S12, we compare that baseline (reproduced in panel B) against standard deviations of 0 (panel A) and 1.5 (panel C) across locations. Results suggest only minor sensitivity to this factor, with slightly better estimates in the absence of variance in basic reproduction number across locations. Given the wide range of basic reproduction numbers explored in this analysis compared to actual numbers, we do not think this issue would pose a robustness challenge to the main findings.


Another variant on the distribution of the basic reproduction number considers the correlation between that parameter and the temperatures informing the weather function. In the baseline specification and all the experiments so far, we used a correlated version of that relationship (with a correlation of 0.47 between basic reproduction number and average temperature; see the simulation model specification, Section 5.2.1). Here we compare that setup (reproduced in Figure S13, panel B) with the uncorrelated version, in which the basic reproduction number is drawn independently for each location with a mean of 3 and a standard deviation of 0.9 (panel A). Results are hardly sensitive to this potential correlation.
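One common way to generate a basic reproduction number with a target correlation to temperature is to mix two independent standard normal draws. The sketch below does this for a correlation of 0.47; it is not necessarily the construction used in the simulation model, and the temperature mean and standard deviation (`temp_mean`, `temp_sd`) are hypothetical placeholders.

```python
import math
import random

RHO = 0.47  # target correlation between R0 and average temperature (from the text)


def draw_location(rng, rho=RHO, r0_mean=3.0, r0_sd=0.9,
                  temp_mean=15.0, temp_sd=10.0):
    """Draw (average temperature, R0) for one location.

    temp_mean and temp_sd are hypothetical; R0 is floored at 0 as in the text.
    """
    z_temp = rng.gauss(0.0, 1.0)
    z_noise = rng.gauss(0.0, 1.0)
    # Mixing the two draws gives a standard normal with correlation rho to z_temp.
    z_r0 = rho * z_temp + math.sqrt(1.0 - rho ** 2) * z_noise
    temperature = temp_mean + temp_sd * z_temp
    r0 = max(0.0, r0_mean + r0_sd * z_r0)
    return temperature, r0


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


rng = random.Random(1)
draws = [draw_location(rng) for _ in range(20000)]
temps, r0s = zip(*draws)
sample_corr = pearson(temps, r0s)
```

Calling `draw_location(rng, rho=0.0)` gives the uncorrelated variant, in which the basic reproduction number is drawn independently of temperature.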

Figure S13: Impact of including correlation between basic reproduction number and average location-specific temperature (baseline; reproduced in panel B) vs. having no correlation (panel A).

The simulation model so far assumed a constant test fraction of f=0.1, that is, in expectation only 10% of infections were detected. In practice this ratio may change over time as test capacity ramps up in response to infection measures. Here, we explore results under three such ramp up scenarios. In Figure S14 -panels A-C, the following ramp up scenarios are assumed: A) = (0.2,0.001 ); B) = (0.2,0.05Log 10 ( + 1)); and C) = (0.2,0.01√ ). Overall, results are rather insensitive to these very different test fraction numbers, suggesting robustness to this consideration. 

The key findings from these experiments include:

1) The model specification we use could identify the correct weather impacts if true infections were observable;

2) In the absence of those data, however, estimation of weather impacts becomes complicated, and many intuitive specifications fail to recover the true impacts. This may be a general challenge afflicting any attempt to identify the link between weather and COVID-19 transmission;

3) The specification we selected may offer a conservative but qualitatively informative view of the true underlying impacts. For example, in most estimations the quadratic term is estimated at about 20% of its true value; and

4) This specification is largely robust to a few key uncertainties that may vary between simulated numbers and the actual epidemic.


In this last step of synthetic data analysis, our objective was to use a more detailed individual-level model of infection (a stochastic agent-based model of interacting individuals) to create synthetic data of reported cases, distort the outcome with a delay function representing testing and reporting, and examine whether our statistical methods are still capable of recovering the weather functions. Synthetic data were created by two of our co-authors (NG and MG), who hid their assumed temperature effect functions from our statistician (RX), whose task was to discover them.

To that end, we created an agent-based simulation model of infection and simulated it for 100 hypothetical towns with different populations, different R0 values (potentially due to different contact rates and population density), different start days of infection, and different temperatures. We started from the generic individual-level model of infection (available in the NetLogo library), which is consistent with the basic SIR model at the individual level. We modified the model using parameter values that are more consistent with COVID-19 and added several features needed to import data to and export data from the model. We modified the infection function to include the temperature effect on the probability of infection. We used three major scenarios for the temperature effect: an inverse U-shaped effect, a linearly increasing effect, and no effect (placebo). The scenarios included actual temperature values from a sample of 100 regions in the real-world data. The ABM's output was generated using a detection delay that follows a Poisson distribution with a mean of 10 days. These data were used to estimate true infections with the method discussed in section 3. The model code is available at https://github.com/marichig/weather-conditions-COVID19/. Figure S12 shows an example of creating synthetic data (scenario 1, explained below), with panel A showing the true cumulative infections and panel B showing the reported values.
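The detection-delay step can be sketched in a few lines: each true infection is moved forward by its own Poisson-distributed delay with a mean of 10 days. This is an illustration rather than the NetLogo implementation in the repository; the function names, the seed, and the toy input are assumptions.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def report_with_delay(true_daily, rng, mean_delay=10):
    """Move each true infection forward by its own Poisson-distributed delay."""
    reported = [0] * len(true_daily)
    for day, count in enumerate(true_daily):
        for _ in range(count):
            report_day = day + poisson_draw(rng, mean_delay)
            if report_day >= len(reported):
                # Extend the series when a report falls past the last day.
                reported.extend([0] * (report_day - len(reported) + 1))
            reported[report_day] += 1
    return reported


rng = random.Random(42)
true_daily = [5000] + [0] * 29  # toy input: all infections occur on day 0
reported = report_with_delay(true_daily, rng)
mean_report_day = sum(d * c for d, c in enumerate(reported)) / sum(reported)
```

Because the delay is distributed rather than fixed, infections from a single day are spread over many reporting days, which is why a simple fixed shift cannot fully undo it.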

Figure S12. An example of the process of generating synthetic data with an assumed temperature effect hidden from our statistician. True cumulative cases (A) and cumulative number of confirmed cases (B).

The tests (scenarios) included a quadratic effect (S1), no effect (S2), and a positive linear effect (S3). For all scenarios we tested models both with fixed effects (Si1) and without (Si2). In the non-fixed-effect tests, to include a control variable consistent with our main regression, we added one extra hypothetical variable, "population density", to represent variations across locations that are correlated with the basic reproduction number. In this setup, "population density" was correlated with the basic reproduction number excluding temperature-related factors (correlation of 0.8).

Our statistician did not know the true temperature function, so they used both linear and quadratic terms to map the predicted temperature effect in all cases, even when the effect was linear.
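To illustrate why fitting both terms is a safe default, the sketch below fits a quadratic by ordinary least squares to a noiseless, purely linear effect; the quadratic coefficient then comes out at essentially zero. The function `fit_quadratic` and the toy temperature grid are assumptions for illustration and do not reproduce the study's regression, which includes further controls.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = b0 + b1 * x + b2 * x**2 via the normal equations."""
    # Power sums give X'X and X'y for the design matrix with columns [1, x, x^2].
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]  # augmented matrix
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [ar - factor * ac for ar, ac in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


# Toy example: a purely linear "true" effect over a hypothetical temperature grid.
temps = [0, 5, 10, 15, 20, 25, 30]
linear_effect = [1.0 - 0.02 * t for t in temps]
b0, b1, b2 = fit_quadratic(temps, linear_effect)
```

With noisy data the estimated quadratic coefficient would not be exactly zero, so its size has to be judged against its uncertainty.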

Results are graphically summarized in Figure S13; for each scenario the results are compared with the "true" function of temperature (darker lines). Overall, our statistician was able to correctly estimate the sign and magnitude of the temperature effect in all cases, although the effect was generally underestimated, further supporting the proposition that the method offers conservative estimates (e.g., in Figure S13, left panel, compare the S11 and S12 curves with the "S1-true effect" curve).


In this section we report graphs of projected COVID-19 Risk Due to Weather (CRW) at four different time periods in the coming year. These projections use a 15-day moving window to average the weather variables over the previous year (2019-2020) and use those averages as predictors for the coming year (2020-2021). Year-round daily projections, both for these global locations and for the 1,072 largest cities across the world, are available at https://projects.iq.harvard.edu/covid19.
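The averaging step can be sketched as a 15-day moving average over a daily weather series. The text does not say whether the window is trailing or centered, so the centered window that wraps around the end of the year is an assumption here, as are the function name and the toy series.

```python
def moving_average_circular(series, window=15):
    """Centered moving average that wraps around the ends of the series.

    Assumes an odd window so that it is symmetric around each day.
    """
    n = len(series)
    half = window // 2
    return [
        sum(series[(i + k) % n] for k in range(-half, half + 1)) / window
        for i in range(n)
    ]


# Toy daily series (hypothetical): 30 days with a steadily rising value.
daily_values = list(range(30))
smoothed = moving_average_circular(daily_values, window=15)
```

In this sketch, the smoothed value for a given calendar day of the previous year would serve as the predictor for the same calendar day of the coming year.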



We thank Peiyi Li, Hongjin Xu, and Wenhan Dai who assisted us in collecting and evaluating data from China, and Maedeh Moghadam who helped us with assembling data from Iran.

No funding was used to conduct this study.