key: cord-0873708-42liu9k5
authors: Ren, Hongyan; Zhao, Lu; Zhang, An; Song, Liuyi; Liao, Yilan; Lu, Weili; Cui, Cheng
title: Early forecasting of the potential risk zones of COVID-19 in China's megacities
date: 2020-08-10
journal: Sci Total Environ
DOI: 10.1016/j.scitotenv.2020.138995
sha: 6991c189ea274364c77d13b4414d483d851dfaaf
doc_id: 873708
cord_uid: 42liu9k5

Recently, the coronavirus disease 2019 (COVID-19) has become a worldwide public health threat. Early and quick identification of the potential risk zones of COVID-19 infection is increasingly vital for megacities implementing targeted infection prevention and control measures. In this study, the communities with confirmed cases during January 21–February 27 were collected as the specific epidemic data for Beijing, Guangzhou, and Shenzhen. We evaluated the spatiotemporal variations of the epidemics and then used ecological niche models (ENM) to combine the epidemic data with nine socioeconomic variables to identify the potential risk zones of this infection in these megacities. The three megacities differed in the spatial patterns and numbers of infected communities, the average cases per community, the percentages of imported cases, and the potential risks, although their COVID-19 infections have been preliminarily contained to date. With higher risks dominated by different influencing factors in each megacity, the potential risk zones covered about 75% to 100% of currently infected communities. Our results demonstrate that the ENM method can serve as an early forecasting tool for identifying potential COVID-19 infection risk zones on a fine scale. We suggest that local hygienic authorities keep a close watch on the epidemic in each megacity so that they can adequately implement and adjust their interventions in zones with more residents or likely crowded places.
This study would provide useful clues for relevant hygienic departments making quick responses to increasingly severe epidemics in similar megacities around the world.

• COVID-19 infection still poses an increasing threat to public health in the world.
• We explored the feasibility of Maxent models in identifying the potential risks.
• Socioeconomic factors affected the spatial distribution of potential risk zones.
• Dominant influencing factors on potential risk zones varied in the megacities.
• Maxent models were suitable for early identification of potential COVID-19 risk zones.

Involving six continents, the outbreaks of coronavirus disease 2019 (COVID-19) are still raging and have become a global public health concern (WHO, 2020a, 2020b, 2020c). It is urgent for countries and regions to implement effective strategies for containing the increasingly severe epidemics as soon as possible. Many epidemiological and clinical studies have pointed out that COVID-19 infection is characterized by person-to-person transmission, especially within communities in some affected geographic areas and during the incubation period (Chan et al., 2020; Huang et al., 2020; Li et al., 2020a; Pongpirul et al., 2020; Yu et al., 2020). Because there is currently no effective vaccine or specific therapeutic drug for this disease, the critical prevention and control strategies are to block person-to-person transmission and to avoid exposure to the virus (Deng and Peng, 2020; Wang et al., 2020). Numerous reports have used various models to forecast the size and duration of potential COVID-19 outbreaks at an early stage, supporting effective infection prevention and control strategies in China and other countries (Li et al., 2020a; McBryde, 2020; Roosa et al., 2020; Sun et al., 2020; Wu and McGoogan, 2020; Xu et al., 2020).
Furthermore, identifying the potential risk zones or regions of an infectious disease and their influencing factors is equally meaningful for hygienic authorities that need to implement effective infection prevention and control measures precisely on a fine scale (Gao and Cao, 2019; Li et al., 2017). However, this has seldom been addressed in previous investigations. Owing to their much larger populations and more active socioeconomic vitality, China's megacities were confronted with greater stress from the COVID-19 outbreaks. Therefore, our study uses ecological niche models (ENM) to identify the potential risk zones of COVID-19 infection from January 21 to February 27, as well as their predominant influencing factors, in Beijing, Guangzhou, and Shenzhen. This study would supply useful clues for local hygienic authorities deciding where to prioritize effective interventions on a fine scale.

2.1. Data collection and processing

2.1.1. Specific epidemic data

Beijing, Guangzhou, Shanghai, and Shenzhen are well known as China's four megacities for their much higher overall strength and top-ranking competitiveness. These megacities are not only the leading core of China's economy but also exert a global influence on world economies. Shanghai was not included in this study because its data were unavailable. During the COVID-19 outbreak, the numbers of cases that were laboratory-confirmed by detection of virus nucleic acid, as well as those clinically diagnosed according to the Diagnosis and Treatment Program of 2019 New Coronavirus Pneumonia issued by China's National Health Commission (NHC), were released daily by hygienic authorities (Table 1). In this study, the location (name of the community) of each confirmed case between January 21 and February 27 was transformed into a spatial point layer, serving as the specific epidemic data for these megacities, using the geocoding tool (http://www.gpsspg.com/xGeocoding) and ArcGIS 10.3 software (ESRI, Redlands, CA, USA).
At present, more detailed and evidence-based knowledge of this disease, such as its means and modes of transmission, its transmissibility, and its risk factors, still needs to be established through enhanced surveillance and further investigation (Wong et al., 2020; Wu and McGoogan, 2020; Zhu et al., 2020). However, several recent epidemiological investigations of COVID-19 have pointed out that person-to-person transmission caused by confirmed and suspected cases does exist at the community level in some affected geographic regions during the incubation period (Chan et al., 2020; Huang et al., 2020; Li et al., 2020b; Pongpirul et al., 2020; Yu et al., 2020). Here, we mainly considered places with a crowded population or a large floating population as an important influencing factor of this infection. Nine socioeconomic factors were collected and categorized into four types: population density; floating population or demand for public transportation (bus stops, subway stations, and length of roads); demands of daily life (rent of rental houses, shopping malls, and supermarkets or convenience stores); and medical resources (major hospitals, and appointed hospitals or fever clinics for diagnosing COVID-19 infection). These factors were indexed by their densities on the 1 km × 1 km scale across Beijing, Guangzhou, and Shenzhen. The sources, types, and processing of these variables are shown in Table 1. Fig. S1A-C (supplementary material) presents the spatial distribution of the nine socioeconomic conditions in the three megacities. Socioeconomic variables in Beijing and Guangzhou presented similar spatial patterns and were mostly concentrated in their central regions (districts).
In comparison, several socioeconomic conditions (including bus stops, subway stations, shopping malls, supermarkets, hospitals, and appointed hospitals) were distributed relatively evenly across Shenzhen, a city characterized by a relatively small urban size and by rapid, large-scale urbanization and industrialization over the past forty years. Meanwhile, the rest (i.e., length of roads, average rental price, and population density) displayed spatial clustering in its central (administrative, social, and economic) regions. Ecological niche models (ENMs), which predict the distribution of species on the basis of niche theory (Phillips et al., 2006), have been widely applied to estimate disease risk from known case locations and a set of environmental variables describing some of the factors that likely influence the spread of an epidemic (Aguiar et al., 2018; Ardestani and Mokhtari, 2020). As a typical ENM (https://biodiversityinformatics.amnh.org/open_source/maxent), the Maxent model was employed in our study to identify the potential risk zones of COVID-19 infection, as well as their predominant influencing factors. Fig. 1 illustrates the whole process of the Maxent model in detail. A stable result was obtained through 10 replicates with cross-validation: all samples were randomly divided into ten folds, and each fold was used in turn as test data. In this way, 90% of the sample data was used for training and the remaining 10% for testing in each replicate. As a result, three Maxent models (Model 1, Model 2, and Model 3) were built for each megacity on the basis of the epidemic data during 21st Jan-3rd Feb, 21st Jan-8th Feb, and 21st Jan-12th Feb, respectively. The performance of the Maxent models was assessed in terms of the area under the curve (AUC) of the receiver operating characteristic (ROC) (Ardestani and Mokhtari, 2020).
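The ten-fold split and the AUC measure described above can be illustrated with a minimal sketch. It is not the Maxent software's own implementation: the function names `kfold_indices` and `auc` are hypothetical, and treating AUC as a comparison between scores at case locations and scores at reference (for example, background) locations is an assumption made here for illustration.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def auc(case_scores, reference_scores):
    """Rank-based AUC: the probability that a randomly chosen case location
    scores higher than a randomly chosen reference location (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for r in reference_scores:
            if c > r:
                wins += 1.0
            elif c == r:
                wins += 0.5
    return wins / (len(case_scores) * len(reference_scores))


# Hypothetical example: 100 infected communities split into ten folds.
folds = kfold_indices(100, k=10)
for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(100) if i not in held_out]
    # Each replicate trains on 90% of the samples and tests on the other 10%.
    assert len(train) == 90 and len(test_fold) == 10
```

With this definition, a model that ranks every case location above every reference location has an AUC of 1.0, and a model that cannot separate them has an AUC of 0.5.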
The AUC ranges from 0 to 1; a higher value represents a more reliable prediction, that is, one further from a random prediction (AUC = 0.5). A commonly used AUC classification is as follows: excellent (0.9-1.0), good (0.8-0.9), acceptable (0.7-0.8), bad (0.6-0.7), and insufficient (0.5-0.6) (Greiner et al., 2000; Li et al., 2017). Furthermore, the output values of the models (predicted risk as logistic output) were reclassified into five grades: 1 (0-0.1), 2 (0.1-0.2), 3 (0.2-0.4), 4 (0.4-0.6), and 5 (0.6-1), among which this study focused on the much higher risks (grades 4-5). The models were then validated against the epidemic data of infected communities in 6th-8th Feb, 9th-12th Feb, 13th-27th Feb, and 6th-27th Feb, yielding a set of percentages of infected communities covered by the potential risk zones. In addition, the percent contributions, which approximate the influence of each socioeconomic variable on the risk distribution, and the responses of the risk distribution to these independent variables were derived from the models. In the three megacities, the numbers of infected communities, the average cases per infected community, and the percentages of cases imported from outside regions decreased from 21st Jan to 27th Feb (Table 2). Meanwhile, the COVID-19 infections in Beijing appeared to be clustered at the community level, given its larger average number of cases per community. Guangzhou and Shenzhen tended to be heavily affected by imported infections throughout, given their higher percentages of imported cases. The infected communities in each megacity also showed different spatial patterns. As of 27th Feb, the majority of infected communities in Beijing and Guangzhou were spatially clustered in their respective central districts (Fig. 2). By contrast, the infected communities were somewhat dispersed across Shenzhen (Fig. 2).
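The five-grade reclassification and the coverage percentage used for validation, as described in the methods above, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the names `risk_grade` and `coverage_percent` are hypothetical, and because the text does not say which grade a boundary value such as 0.1 belongs to, the sketch assumes each boundary falls in the higher grade.

```python
from bisect import bisect_right

# Upper bounds of grades 1-4 as given in the text; grade 5 covers 0.6-1.
GRADE_BREAKS = [0.1, 0.2, 0.4, 0.6]


def risk_grade(p):
    """Map a logistic risk value in [0, 1] to a grade from 1 to 5.

    Assumption: a value lying exactly on a break belongs to the higher grade.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("risk value must lie in [0, 1]")
    return bisect_right(GRADE_BREAKS, p) + 1


def coverage_percent(risks_at_communities, min_grade=4):
    """Percentage of infected communities located in zones of at least min_grade."""
    hits = sum(1 for p in risks_at_communities if risk_grade(p) >= min_grade)
    return 100.0 * hits / len(risks_at_communities)


# Hypothetical predicted risks at four newly infected communities.
example_risks = [0.72, 0.45, 0.30, 0.05]
grades = [risk_grade(p) for p in example_risks]  # [5, 4, 3, 1]
covered = coverage_percent(example_risks)        # 50.0
```

Here two of the four hypothetical communities fall in grade 4-5 zones, so the coverage is 50%.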
These results showed that the COVID-19 epidemics in the three megacities differed from one another, although the infection has been preliminarily contained in each. With mean AUC values of 0.80-0.95, the Maxent models performed very well (Fig. 3). The infected communities were well covered by the potential COVID-19 risk zones derived from the three models (Table 3). In particular, Model 1, based on the earliest epidemic data (21st Jan-3rd Feb), achieved comparable precision rates for the risk zones with grades 4-5 in each megacity, covering 65%-68% of the currently infected communities from 6th to 27th Feb. In terms of time efficiency, Model 1 thus provided an earlier yet reliable forecast of the potential COVID-19 infection risks in each megacity. Fig. 4 presents the spatial distribution of the potential risk zones of COVID-19 in the three megacities. On the 1 km × 1 km scale, the zones with much higher risks (grades 4-5) in Beijing were densely distributed in the central regions and were closely surrounded by risk zones of grades 3-2 in the outer regions (Fig. 4). Somewhat differently, the zones with much higher risks in Guangzhou were mainly located in the central regions, with surrounding zones of grades 3-2 to the north and south (Fig. 4). In comparison, the risk zones with grades 4-5 in Shenzhen were mostly concentrated in the southwestern regions (Fig. 4). As a result, the seriousness of the potential COVID-19 infection differed among the three megacities. Shenzhen had a much higher area proportion of grade 4-5 risk zones than Beijing and Guangzhou (see Table S1 in supplementary material). Table 4 shows the percent contributions of the nine socioeconomic variables to the COVID-19 infection risks derived from the Maxent models in Beijing, Guangzhou, and Shenzhen.
The percent contributions of these socioeconomic variables differed from one another. The variables in the first group, each contributing more than 20% to the potential COVID-19 infection risks, were considered the predominant influencing factors (Table 4). In Model 1, the socioeconomic factors in the second group had individual percent contributions of 1%-10%, while the remaining variables, with individual percent contributions of 0-1%, were categorized into the third group. In addition, the percent contributions of all factors varied over time among Models 1, 2, and 3 in the three megacities. As illustrated in Fig. 5, the potential COVID-19 infection risks derived from the Maxent models responded differently to the above predominant influencing factors in the three megacities. In Beijing, much higher risks (beyond 0.71, grade 5) of COVID-19 infection occurred in the zones with both a larger population density (beyond 3400 persons per km²) and a moderate number of supermarkets (3.54-29.99 per km²). In Guangzhou, the zones with a larger population density (beyond 1000 persons per km²), more than 0.11 supermarkets per km², and more bus stops seemed to present much higher infection risks (grades 4-5). In Shenzhen, much higher risks of COVID-19 infection tended to occur in the zones with a larger length of roads (0.42 km per km²), a higher rent of rental houses (>45.65 Yuan per m²), and more bus stops (>0.92 per km²). In general, much higher infection risks in these megacities were very likely to appear in the zones with large values of each city's own prevailing socioeconomic conditions. In recent weeks, China's megacities, such as Beijing, Guangzhou, and Shenzhen, were confronted with greater stress from the COVID-19 outbreaks because of their much larger populations and more active socioeconomic vitality.
Fortunately, the COVID-19 epidemics in these megacities have been preliminarily contained to date through effective and precise strategies. In this study, we evaluated the spatial and temporal patterns of the promptly released communities with confirmed COVID-19 cases and then explored the feasibility of the Maxent method for early identification of the potential infection risk zones in these megacities. Several notable findings were obtained and would provide a useful reference for regions with increasing local transmission as they design and implement targeted containment efforts against this epidemic. In the early periods of the COVID-19 outbreaks, many studies had already accomplished now-casting or early forecasting of the sizes, duration, or dynamics of potential outbreaks through various models (Roosa et al., 2020; Sun et al., 2020; Wu and McGoogan, 2020), which provided timely and useful clues for hygienic authorities at various administrative levels in China and/or other countries/territories/areas. In comparison, our study identified the spatial distribution of potential infection risk zones on a fine scale (1 km × 1 km), their predominant influencing factors, and their area and population proportions, which is equally meaningful for hygienic authorities seeking to keep local residents from being exposed to the zones with higher potential COVID-19 infection risks in Beijing, Guangzhou, and Shenzhen. In particular, Model 1, based on the earliest epidemic data, presented comparable precision rates and excellent performance among the three models built on epidemic data (infected communities) from different time windows. That is to say, the ENM method could meet the requirements of timeliness and fine spatial scale in forecasting the raging outbreaks of this disease.
In other words, it is feasible to employ the ENM method for early forecasting of potential COVID-19 infection risks on a fine scale in the countries or megacities suffering from this disease.

Table 2 notes: ⁎ Average values of cases per community were calculated for the periods of 21st Jan-3rd Feb, 21st Jan-8th Feb, 21st Jan-12th Feb, and 21st Jan-27th Feb, because the newly confirmed cases may belong to some previously infected communities. N/A: the numbers of imported cases in Guangzhou in these periods were not obtained. The second stage only covered 6th-8th Feb because detailed information on infected communities in 4th-5th Feb was missing.

Beijing, Guangzhou, and Shenzhen presented clearly different patterns of potential COVID-19 infection, including the severities, spatial heterogeneities, and predominant influencing factors. There may be several reasons for these differences. The first is that the spatial distribution of the nine selected socioeconomic variables differed markedly among the three megacities because of their own socioeconomic development (Fig. S1A-C in supplementary material), which could well explain the differences in the potential severities (in terms of either area or population proportions) of the risk zones, as well as their spatial heterogeneities. Beijing (the Capital of China) and Guangzhou (the South Gate of China) are two major central megacities whose socioeconomic resources are more densely distributed in their central districts. By contrast, Shenzhen is a relatively younger city that has undergone rapid socioeconomic development over the past four decades, leading to socioeconomic conditions that are sparsely distributed across the city (Fig. S1C).
Secondly, the actual situation of increasing local transmission characterized by familial clustering in Beijing and Guangzhou could reasonably explain why the potential COVID-19 infection risks there were heavily influenced by population density, the number of supermarkets, and the number of bus stops (in Guangzhou only) in Table 4. In addition, Shenzhen has a larger migrant floating population from other provinces (including Hubei Province), who mainly reside in rental houses; this may explain the distinct contribution of the rent of rental houses to the infection risks in this megacity. These results indicated that the potential COVID-19 situation in each megacity tended to be spatially influenced not only by the current epidemic situation but also by socioeconomic conditions. In our study, much higher infection risks were very likely to occur in the zones with more residents and/or probably highly crowded floating populations. This could be well explained by the characteristics of person-to-person transmission (Chan et al., 2020; Huang et al., 2020; Li et al., 2020a; Pongpirul et al., 2020; Yu et al., 2020). The timely identification of places with a dense population (local residents or highly crowded floating people) should be considered one of the critical measures for cutting off the chain of transmission (Anderson et al., 2020; Phelan et al., 2020; Wu and McGoogan, 2020).

Table 3. Precision rates for the validations of Models 1, 2, and 3 in Beijing, Guangzhou, and Shenzhen. ⁎ The subsequently summed communities in 6th-27th Feb and 9th-27th Feb were utilized for Models 1 and 2. Models 1, 2, and 3 were respectively validated by the datasets of infected communities in 6th-8th Feb, 9th-12th Feb, and 13th-27th Feb. Note: Models 1, 2, and 3 were respectively built with the epidemic data in 21st Jan-3rd Feb, 21st Jan-8th Feb, and 21st Jan-12th Feb.
In recent weeks, a series of tailored infection prevention and control measures has been fully implemented at the grassroots level to isolate confirmed and suspected cases and to cut off the chain of transmission, resulting in remarkably reduced local COVID-19 transmission across China outside Hubei Province (e.g., local cases below ten from February 13 in the three megacities). This helps explain why the ranks of the predominant influencing factors differed slightly among Models 1-3, which were built on epidemic data from different time windows. We therefore cautiously suggest that local hygienic authorities keep a close watch on the ongoing epidemic and promptly update and adjust both the risk zones (specific regions) of potential COVID-19 infection and the targeted intervention strategies that rely on the predominant factors. Besides, many efficient intervention measures have shortened the time China needed to contain the epidemic and have provided important support for the quick resumption of production in February and March (Tian et al., 2020; Zhao et al., 2020). These achievements have saved considerable time for other countries to contain this epidemic (Heymann, 2020; Special Expert Group for Control of the Epidemic of Novel Coronavirus Pneumonia of the Chinese Preventive Medicine Association, 2020; Wang et al., 2020; WHO, 2020c). Outside China, however, there has been a total of 2,234,827 confirmed cases in 212 infected countries/territories/areas to date (WHO, 2020a), in which local transmission of this disease has not been well cut off. Encouragingly, local transmission was well controlled in Singapore through analogous measures (suspension of major events and a rigorous stay-home notice), although its first imported case had already been reported on 23rd Jan (Wong et al., 2020).
Comparable infection prevention and control strategies were also adopted in some recently infected countries, such as the special quarantine of Daegu and Gyeongsangbuk-do in the Republic of Korea, as well as of Lombardia and Veneto in Italy. Therefore, we cautiously suggest that enhanced strategies be strictly implemented as soon as possible to further block local transmission in the hotspot regions. Our findings are subject to several limitations. First, the identification of the potential COVID-19 infection risk zones could be improved by considering more potential variables, since many rigorous intervention measures had been implemented in these megacities and epidemiological investigations of this epidemic are increasingly being conducted. Secondly, spatial models (e.g., geographically weighted regression) should be further employed to explore the spatial heterogeneity of the responses of the infection risks to socioeconomic variables, so that the strength of the influence of socioeconomic conditions on the infection risks could also be determined spatially for the risk zones. Finally, the influence of imported cases (local residents and migrants from outer regions) should be spatially characterized by exploring their real addresses, although it could be partly reflected by the spatial distribution of the rental houses. In summary, the ENM method was an effective early forecasting tool for identifying the potential COVID-19 infection risk zones and their predominant influencing factors before the outbreak was efficiently contained in the megacities. We cautiously suggest that continued attention be paid to the ongoing epidemics, although the current intervention strategies have contributed greatly to the preliminary containment of this disease in China's megacities.
Our study implies that these efficient strategies, which rely on the predominant socioeconomic influencing factors of COVID-19 infection, should be more strictly enforced in the countries, regions, and cities with increasingly serious epidemics.

Supplementary data to this article can be found online at https://doi.org/10.1016/j.scitotenv.2020.138995. No additional data available.

This study was funded by the National Natural Science Foundation of China (Grant No. 41571158).

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Potential risks of Zika and chikungunya outbreaks in Brazil: a modeling study
How will country-based mitigation measures influence the course of the COVID-19 epidemic
Modeling the lumpy skin disease risk probability in central Zagros Mountains of Iran
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
Characteristics of and public health responses to the Coronavirus disease 2019 outbreak in China
Meteorological conditions, elevation and land cover as predictors for the distribution analysis of visceral leishmaniasis in Sinkiang province, mainland China
Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests
Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts
Data sharing and outbreaks: best practice exemplified
A family cluster of SARS-CoV-2 infection involving 11 patients in Nanjing
Early dynamics of transmission and control of COVID-19: a mathematical modelling study
Game consumption and the 2019 novel coronavirus
Ecological niche modeling identifies fine-scale areas at high risk of dengue fever in the pearl river delta
Early transmission dynamics in Wuhan, China, of Novel Coronavirus-infected pneumonia
The value of early transmission dynamic studies in emerging infectious diseases
The Novel Coronavirus originating in Wuhan, China: challenges for Global Health governance
Maximum entropy modeling of species geographic distributions
Journey of a Thai taxi driver and Novel Coronavirus
Real-time forecasts of the COVID-19 epidemic in China from
Special Expert Group for Control of the Epidemic of Novel Coronavirus Pneumonia of the Chinese Preventive Medicine, A, 2020. An update on the epidemiological characteristics of novel coronavirus pneumonia (COVID-19)
Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study
An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China
A novel coronavirus outbreak of global health concern
COVID-19): Situation Report-91 2020
COVID-19): Situation Report-51 2020
COVID-19): Situation Report-30 2020
COVID-19 in Singapore-current experience: critical global issues that require attention and action
Characteristics of and important lessons from the Coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72314 cases from the Chinese Center for Disease Control and Prevention
Open access epidemiological data from the COVID-19 outbreak
A familial cluster of infection associated with the 2019 novel coronavirus indicating potential person-to-person transmission during the incubation period
Backtracking transmission of COVID-19 in China based on big data source, and effect of strict pandemic control policy
China Novel Coronavirus, I., Research, T., 2020. A Novel Coronavirus from patients with pneumonia in China

We thank all the persons involved in the response to this outbreak. We also thank Miss Lei Yan-hui for her help with the Maxent modeling in this manuscript.