key: cord-0130640-bhe8krmm authors: Wang, Peixiao; Hu, Tao; Liu, Hongqiang; Zhu, Xinyan title: Exploring the impact of under-reported cases on the COVID-19 spatiotemporal distribution using healthcare worker infection data date: 2020-11-10 journal: nan DOI: nan sha: 865f8e0243b1de8597afd62772ddc1df5f6bc8c6 doc_id: 130640 cord_uid: bhe8krmm A timely understanding of the spatiotemporal pattern and development trend of COVID-19 is critical for timely prevention and control. However, the under-reporting of cases is widespread in fields associated with public health. It is also possible to draw biased inferences and formulate inappropriate prevention and control policies if the phenomenon of under-reporting is not taken into account. Therefore, in this paper, a novel framework was proposed to explore the impact of under-reporting on COVID-19 spatiotemporal distributions, and empirical analysis was carried out using infection data of healthcare workers in Wuhan and Hubei (excluding Wuhan). The results show that (1) the lognormal distribution was the most suitable to describe the evolution of epidemic with time; (2) the estimated peak infection time of the reported cases lagged the peak infection time of the healthcare worker cases, and the estimated infection time interval of the reported cases was smaller than that of the healthcare worker cases. (3) The impact of under-reporting cases on the early stages of the pandemic was greater than that on its later stages, and the impact on the early onset area was greater than that on the late onset area. (4) Although the number of reported cases was lower than the actual number of cases, a high spatial correlation existed between the cumulatively reported cases and healthcare worker cases. The proposed framework of this study is highly extensible, and relevant researchers can use data sources from other counties to carry out similar research. determine the development trend of the epidemic da Silva Corrêa & Perl, 2021) . 
At present, many scholars have used reported case data to explore the spatiotemporal characteristics of COVID-19, including predicting the inflection point of the disease (Shen et al., 2020; Tang et al., 2020) and detecting its spatial distribution and movement (Jia et al., 2020; Q. Liu et al., 2020; Y. Wang et al., 2021). However, under-reporting of cases is a very common phenomenon in fields associated with public health, such as epidemiology and biomedicine (Fernández-Fontelo et al., 2016). In particular, for newly detected infectious diseases such as COVID-19, under-reporting is more likely to occur because the disease is poorly understood and the diagnostic criteria are strict (Cabaña et al., 2020). If under-reporting is not taken into account, biased inferences may be drawn and inappropriate urban prevention and control policies may be formulated. To alleviate the problem of under-reporting, patient death data are regarded as more accurate and are used to study the phenomenon, for example by reconstructing the true epidemic level of COVID-19 and by exploring the impact of under-reported cases on the mortality rate and the effective reproduction number (Backer et al., 2020; Lau et al., 2021; Linton et al., 2020; Prado et al., 2020; Saberi et al., 2020). Although the abovementioned studies analyzed the impact of under-reporting on the epidemic situation from multiple perspectives, some deficiencies remain. First, existing methods mainly estimate the statistical characteristics of COVID-19 from the perspective of time, ignoring the impact of space on the estimation of these characteristics. Second, death data are themselves not always accurately reported, and there may be a deviation between the number of reported deaths and the actual number of deaths due to COVID-19.
Compared with patient death data, infection data on healthcare workers can be considered a more accurate sample because the dataset is smaller. In the early stage of the epidemic, healthcare worker infections were more easily detected and counted. Therefore, data on confirmed cases of healthcare workers can more accurately reflect the relevant characteristics of COVID-19 (Gao et al., 2020; Ren et al., 2021). Considering this, a novel framework was proposed to evaluate the spatiotemporal characteristics of the COVID-19 outbreak based on the infection data of healthcare workers. The main contributions of this article are as follows: (1) A novel framework was proposed to explore the impact of under-reporting on COVID-19 spatiotemporal distributions. The proposed framework is highly extensible, and relevant researchers can use data sources from other counties to carry out similar research. (2) An empirical analysis was carried out based on the proposed framework using infection data of healthcare workers in Wuhan and Hubei (excluding Wuhan). (3) An open-source dataset of healthcare worker (HCW) diagnoses is provided, which both ensures the reproducibility of the study and provides the data needed to support related research on under-reporting. Studies on the spatiotemporal distributions of COVID-19 have mainly focused on reported case data and have explored the patterns and movement of the epidemic, so as to provide a scientific basis for relevant measures, such as isolation and the restriction of human activities (Askarizad et al., 2021; Lin et al., 2021). For example, scan statistics have been used to detect weekly hotspots of new cases based on confirmed COVID-19 cases at the county level in the United States, thereby characterizing the infection rates during the epidemic. Lak et al. (2021) used spatial regression to explore the spatiotemporal spread pattern of 43,000 confirmed COVID-19 cases at the neighborhood level in Tehran, the capital of Iran.
Another study analyzed the spatiotemporal differences in the spread of the COVID-19 epidemic across 337 prefecture-level cities in China, as well as the social and natural influencing factors. Based on mobile phone and confirmed patient data, Jia et al. (2020) developed a spatiotemporal "risk source" model to determine the geographic distribution and growth trends of COVID-19 infections and thus assess related risks quickly and accurately. Loske (2020) explored the relationship between COVID-19 transmission and transport volumes in food retail logistics by combining transport volume data and confirmed patient data. However, the above studies examine the spatiotemporal distribution of COVID-19 directly from reported case data, ignoring possible under-reporting in those data (Lau et al., 2021). Conclusions drawn directly from the reported cases may deviate from the actual situation, thereby affecting the judgment of decision-makers (Bastos et al., 2021; Deo & Grover, 2021). To alleviate the problem of under-reporting, an accurate small sample can be used to explore the impact of under-reporting on the estimation of spatiotemporal characteristics (Fellows et al., 2021; Pons-Salort et al., 2021). Additionally, it is common in statistics to estimate population characteristics using small samples (Lauer et al., 2020; Smid et al., 2020). At present, most domestic and foreign researchers regard patient death data as relatively accurate and use them to study under-reporting (Backer et al., 2020; Fellows et al., 2021; Lau et al., 2021; Linton et al., 2020; Prado et al., 2020). For example, Russell et al. (2020) reconstructed the early global dynamics of under-ascertained COVID-19 cases and infections. Fellows et al. (2021) estimated the incidence, mortality, and lethality rates of COVID-19 among Indigenous Peoples in the Brazilian Amazon. Li et al.
(2020) estimated the infection peak time via reported case data as well as internet search and social media data. Although the above studies considered the under-reporting phenomenon, some deficiencies remain. First, the statistical characteristics of COVID-19 are mostly estimated from the perspective of time, and the impact of space on these statistical characteristics is largely ignored. Second, death data are themselves not always accurately reported, and there may be a deviation between the number of reported deaths and the actual number of deaths due to COVID-19 (Whittaker et al., 2021). Therefore, in this paper, infection data of healthcare workers are considered a more accurate data source. Compared with patient deaths, healthcare worker infections were more easily detected and counted. Additionally, we proposed a novel framework to explore the impact of under-reporting on COVID-19 spatiotemporal distributions using an accurate small sample. Hubei Province and Wuhan City were the first province and city in China in which COVID-19 was discovered. As of October 2020, the total number of confirmed cases in Hubei Province had reached 68,135, accounting for approximately 81.5% of the total cases in the population, and the total number of confirmed cases in Wuhan had reached 50,340, accounting for approximately 60.80% of the total cases. In order to explore the impact of under-reported cases on the spatiotemporal distributions of COVID-19, the study area was divided into two parts: Wuhan City and Hubei Province (excluding Wuhan). Hubei Province is located in the central part of China, and Wuhan City is located in the central part of Hubei Province, as shown in Figure 1. The data used in this paper included the confirmed cases at the city and county level in Hubei Province and Wuhan, respectively, and information on healthcare workers in China obtained via retrospective analyses.
The reported cases in Hubei Province were mainly obtained from the Health Commission of Hubei Province (http://www.nhc.gov.cn) and spanned the period from January 15, 2020 to March 31, 2020. The reported cases in Wuhan were mainly obtained from the Wuhan Municipal Health Commission (http://wjw.wuhan.gov.cn) and spanned the period from February 23, 2020 to March 31, 2020. Data on the confirmed COVID-19 cases of healthcare workers were mainly obtained via retrospective analyses from the Chinese Red Cross Foundation (https://www.crcf.org.cn/), which distributes relief funds to every healthcare worker suffering from COVID-19. As of September 11, 2020, 83 batches of healthcare workers had received foundation assistance. We used a web crawler to obtain 3,743 publications describing the conditions of healthcare workers who suffered from COVID-19. After matching their addresses, the province, city, and county of every affected healthcare worker were obtained. The data format is shown in Table 1. Yang et al., 2020). Therefore, we mainly conducted data preprocessing for the information on healthcare workers suffering from COVID-19, which was obtained via retrospective analyses. This information is reported to the Chinese Red Cross Foundation in two main ways (Chinese Red Cross Foundation, 2020): (1) confirmed information on healthcare workers is reported directly by individuals to the Chinese Red Cross Foundation; and (2) confirmed information on healthcare workers is collected by their respective hospitals, which then report it to the Chinese Red Cross Foundation. After examination and approval by the Chinese Red Cross Foundation, the relevant information is published on the website, after which the hospital may review it. Those who fail to pass the review do not qualify for aid. Therefore, the records of unqualified persons should be deleted from the original data.
Moreover, the Chinese Red Cross Foundation aided not only infected healthcare workers but also infected or diseased staff during the epidemic; such records also needed to be deleted. As shown in Figure 2a, after data preprocessing, a total of 3,703 confirmed cases of healthcare workers remained, including 3,655 from Hubei Province and 3,058 from Wuhan City. Compared with the real-time data on the confirmed COVID-19 cases of healthcare workers in the same period (Figure 2b) (Gao et al., 2020), the corresponding data obtained via retrospective analyses exhibit an obvious quantitative advantage. In this study, the data on confirmed cases of healthcare workers were considered an accurate small sample with which to explore the impact of under-reporting on the spatiotemporal distributions of COVID-19. The overall framework is shown in Figure 3. First, infection inventories were constructed for the healthcare workers and reported cases in Hubei and Wuhan. Second, based on these inventories, the impacts of under-reported cases on the temporal characteristics were analyzed from three perspectives, namely parameter estimation, temporal correlation, and temporal lag. Finally, the impacts of under-reported cases on the spatial characteristics were analyzed from the perspectives of spatial correlation and spatial lag. This work provides scientific support for researchers who explore the spatiotemporal distribution of COVID-19 using data on reported cases. Taking the healthcare workers in Wuhan as an example, to construct the Healthcare Worker Infection Inventory in Wuhan, the Confirmed Healthcare Worker Inventory in Wuhan City was first generated.
For example, the number of confirmed healthcare workers on day $t$ in Hongshan, Wuhan City, is calculated as shown in Equation (1):

$$C^{hw}_{WH}(t, \text{Hongshan}) = \sum_{i=1}^{N} \mathbf{1}\left[\, hw_i \text{ was confirmed on day } t \text{ in Hongshan} \,\right] \quad (1)$$

where $N$ represents the total number of confirmed healthcare worker infections in China, $hw_i$ represents the information of a specific healthcare worker suffering from COVID-19, and $C^{hw}_{WH}$ represents the Confirmed Healthcare Worker Inventory in Wuhan, which describes the changes in confirmed healthcare worker infections in every county of Wuhan over time. To calculate the Healthcare Worker Infection Inventory, we assumed that the incubation time of SARS-CoV-2 follows a lognormal distribution, as seen in other acute respiratory viral infections (Lessler et al., 2009). Lauer et al. (2020) found that the mean and standard deviation of the random variable $\ln(T)$ were 1.621 and 0.418, respectively, i.e., $\ln(T) \sim N(1.621, 0.418^2)$. Based on the probability distribution function of the incubation time $T$, the daily number of infected healthcare workers can be calculated. For example, the number of infected healthcare workers on day $t$ in Hongshan, Wuhan City, is calculated as shown in Equation (2):

$$I^{hw}_{WH}(t, \text{Hongshan}) = \sum_{k=1}^{K} p_k \, C^{hw}_{WH}(t+k, \text{Hongshan}), \quad 1 \le k \le 13 \quad (2)$$

where $p_k$ represents the probability that the incubation time is $k$ days; $K$ represents the maximum incubation time (as the incubation time of 98.7% of patients is within 13 days, $K$ is set to 13); and $\mu$ represents the mean of the lognormal distribution (Farrington et al., 1996; Unkel et al., 2012). In the early stages of the epidemic, SARS-CoV-2 infections were frequent. Therefore, the two samples of healthcare worker cases and reported cases were used to estimate the infection peak time and the infection time interval of the epidemic under three distributions: normal, lognormal, and gamma.
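The back-calculation behind Equation (2) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the discretization of the lognormal incubation time into daily probabilities (probability mass between day k-1 and day k, renormalized over 1 to 13 days), the function names, and the toy counts are all hypothetical, and the confirmation date is treated as the end of the incubation period.

```python
import math
from statistics import NormalDist

MU, SIGMA = 1.621, 0.418   # lognormal parameters of the incubation time (Lauer et al., 2020)
MAX_INCUBATION = 13        # maximum incubation time in days, as stated in the text


def incubation_probs(mu=MU, sigma=SIGMA, max_days=MAX_INCUBATION):
    """Daily incubation probabilities p_1..p_K.

    Assumed discretization: p_k = P(k-1 < T <= k), renormalized so the
    probabilities over 1..max_days sum to one.
    """
    log_t = NormalDist(mu, sigma)  # distribution of ln(T)

    def cdf(days):
        return log_t.cdf(math.log(days)) if days > 0 else 0.0

    raw = [cdf(k) - cdf(k - 1) for k in range(1, max_days + 1)]
    total = sum(raw)
    return [p / total for p in raw]


def infection_inventory(confirmed, probs):
    """Spread each day's confirmed count back over the preceding days.

    Returns a list of length len(confirmed) + len(probs); element j holds the
    expected number of infections on day (j - len(probs)) relative to the
    first day of `confirmed`, so no early infections are cut off.
    """
    n, max_days = len(confirmed), len(probs)
    infected = [0.0] * (n + max_days)
    for day, count in enumerate(confirmed):
        for k, p in enumerate(probs, start=1):
            infected[day - k + max_days] += p * count
    return infected


if __name__ == "__main__":
    confirmed_hongshan = [0, 2, 5, 9, 4, 1]  # made-up daily counts, not the paper's data
    probs = incubation_probs()
    infected = infection_inventory(confirmed_hongshan, probs)
    print([round(x, 2) for x in infected])
```

Because the probabilities are renormalized, the total number of back-projected infections equals the total number of confirmed cases; only their timing shifts earlier.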
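The probability density functions of the normal, lognormal, and gamma distributions are shown in Equations (3), (4), and (5), respectively:

$$f(x; \mu_n, \sigma_n) = \frac{1}{\sigma_n \sqrt{2\pi}} \exp\left(-\frac{(x-\mu_n)^2}{2\sigma_n^2}\right) \quad (3)$$

$$f(x; \mu_l, \sigma_l) = \frac{1}{x \sigma_l \sqrt{2\pi}} \exp\left(-\frac{(\ln x-\mu_l)^2}{2\sigma_l^2}\right), \quad x > 0 \quad (4)$$

$$f(x; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}, \quad x > 0 \quad (5)$$

where $\mu_n$ and $\sigma_n$ represent the fixed parameters of the normal distribution, $\mu_l$ and $\sigma_l$ represent the fixed parameters of the lognormal distribution, $\alpha$ and $\beta$ represent the fixed parameters of the gamma distribution, and $\Gamma(\cdot)$ represents the gamma function. In this study, maximum likelihood estimation was used to fit the parameters of the three distributions. As the gamma and lognormal distributions are skewed, the median of the distribution was used to approximate the peak time of infection. In addition, the cumulative distribution functions of the normal, lognormal, and gamma distributions were used to estimate the time interval of infection. For example, the infection time interval under the normal distribution is calculated as shown in Equation (6):

$$[t_l, t_u] = \left[F^{-1}(0.025),\; F^{-1}(0.975)\right] \quad (6)$$

where $F(\cdot)$ represents the cumulative distribution function of the normal distribution, $F^{-1}(\cdot)$ represents the inverse function of $F(\cdot)$, $t_l$ represents the lower bound of the interval, and $t_u$ represents the upper bound of the interval. That is, the infection time interval represents the time span within which 95% of patients were infected. In this study, when the differences between the infection peak and the infection interval estimated from the two samples were small, under-reported cases had less impact on the temporal characteristics, and vice versa. Pearson's correlation coefficient is used to measure the degree of correlation between two series (Nahler, 2009). As the inventory $I(t, s)$ contains both a temporal dimension $t$ and a spatial dimension $s$, we calculated both the temporal and the spatial correlation. As shown in Figure 5, the spatial correlation was obtained from the spatial series $\mathbf{h}^{S}_{t}$ and $\mathbf{r}^{S}_{t}$ (healthcare worker and reported infections across regions at time $t$), and the temporal correlation was obtained from the time series $\mathbf{h}^{T}_{s}$ and $\mathbf{r}^{T}_{s}$ (healthcare worker and reported infections over time in region $s$).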
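As a minimal sketch of the fitting step behind Equations (3) to (6), the snippet below estimates the normal and lognormal parameters from daily infection counts by maximum likelihood (closed form for both), then derives the peak time and the 95% infection time interval. The helper names, the use of day indices weighted by daily counts, and the toy numbers are assumptions made for illustration. The gamma fit is left out because its maximum likelihood estimate has no closed form and needs a numerical solver.

```python
import math
from statistics import NormalDist


def weighted_mean_sd(values, weights):
    """Weighted mean and (population) standard deviation."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)


def fit_normal(days, counts):
    """MLE of the normal parameters, treating each infection as one
    observation of its day index."""
    return weighted_mean_sd(days, counts)


def fit_lognormal(days, counts):
    """MLE of the lognormal parameters: the normal MLE applied to
    log(day). Day indices must be strictly positive."""
    return weighted_mean_sd([math.log(d) for d in days], counts)


def normal_summary(mu, sigma, level=0.95):
    """Peak (mean = median) and central interval of a normal fit."""
    dist = NormalDist(mu, sigma)
    tail = (1 - level) / 2
    return mu, (dist.inv_cdf(tail), dist.inv_cdf(1 - tail))


def lognormal_summary(mu, sigma, level=0.95):
    """Peak approximated by the median exp(mu), and central interval,
    obtained by exponentiating the quantiles of ln(x)."""
    peak, (low, high) = normal_summary(mu, sigma, level)
    return math.exp(peak), (math.exp(low), math.exp(high))


if __name__ == "__main__":
    days = list(range(1, 11))                      # day index since the first record
    counts = [1, 3, 8, 15, 20, 17, 10, 5, 2, 1]    # made-up daily infections
    print(normal_summary(*fit_normal(days, counts)))
    print(lognormal_summary(*fit_lognormal(days, counts)))
```

Running the same functions on the healthcare worker series and on the reported series, and comparing the two peaks and intervals, mirrors the comparison described in the text.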
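The temporal and spatial correlation coefficients of the two series were calculated using Equations (7) and (8), respectively:

$$\rho_T(s) = \frac{\sum_{t}\left(h_{t,s}-\bar{h}^{T}_{s}\right)\left(r_{t,s}-\bar{r}^{T}_{s}\right)}{\sqrt{\sum_{t}\left(h_{t,s}-\bar{h}^{T}_{s}\right)^2}\,\sqrt{\sum_{t}\left(r_{t,s}-\bar{r}^{T}_{s}\right)^2}} \quad (7)$$

$$\rho_S(t) = \frac{\sum_{s}\left(h_{t,s}-\bar{h}^{S}_{t}\right)\left(r_{t,s}-\bar{r}^{S}_{t}\right)}{\sqrt{\sum_{s}\left(h_{t,s}-\bar{h}^{S}_{t}\right)^2}\,\sqrt{\sum_{s}\left(r_{t,s}-\bar{r}^{S}_{t}\right)^2}} \quad (8)$$

where $\mathbf{h}^{T}_{s} = \{h_{t,s}\}$ represents the time series of healthcare worker infections in a specific spatial region $s$, $\mathbf{r}^{T}_{s} = \{r_{t,s}\}$ represents the time series of reported infections in a specific spatial region $s$, $\mathbf{h}^{S}_{t}$ represents the spatial series of healthcare worker infections at a specific time $t$, $\mathbf{r}^{S}_{t}$ represents the spatial series of reported infections at a specific time $t$, and a bar denotes the mean of the corresponding series. The cross-correlation function measures the impact of under-reported cases on the spatiotemporal characteristics from the perspective of lag. As shown in Figure 6, the cross-correlation function can be understood as a correlation coefficient with a lag. With regard to the spatial dimension, the spatial lag of series $\mathbf{h}^{S}_{t}$ and $\mathbf{r}^{S}_{t}$ was calculated using Equation (9), as follows:

$$\rho_S(k) = \mathrm{corr}\left(\mathbf{h}^{S}_{t-k},\; \mathbf{r}^{S}_{t}\right) \quad (9)$$

where $\rho_S(k)$ is the spatial correlation coefficient between the two series at lag $k$, $\mathbf{r}^{S}_{t}$ represents the spatial series of reported infections at a specific time $t$, $\mathbf{h}^{S}_{t-k}$ represents the spatial series of healthcare worker infections lagging $k$ days compared to time $t$, and $\mathrm{corr}(\cdot,\cdot)$ is the Pearson correlation of Equation (8). With regard to the temporal dimension, the temporal lag of series $\mathbf{h}^{T}_{s} = \{h_{t,s}\}$ and $\mathbf{r}^{T}_{s} = \{r_{t,s}\}$ was calculated using Equation (10), as follows:

$$\rho_T(k) = \mathrm{corr}\left(\{h_{t,s}\},\; \{r_{t+k,s}\}\right) \quad (10)$$

where $\rho_T(k)$ is the temporal correlation coefficient between the two series at lag $k$, $\{h_{t,s}\}$ represents the time series of healthcare worker infections in a specific spatial region, $\{r_{t+k,s}\}$ represents the time series of reported infections lagging $k$ days compared to time $t$ in a specific spatial region, and $\mathrm{corr}(\cdot,\cdot)$ is the Pearson correlation of Equation (7). By definition, the cross-correlation function can be regarded as a function of the lag, and the lag value that maximizes the cross-correlation function is the average delay time of actual infection. The formal definition is given in Equation (11):

$$\hat{k}_S = \arg\max_{k}\, \rho_S(k), \qquad \hat{k}_T = \arg\max_{k}\, \rho_T(k) \quad (11)$$

where $\hat{k}_S$ represents the estimated delay in space, $\hat{k}_T$ represents the estimated delay in time, and the meanings of $\rho_S(k)$ and $\rho_T(k)$ are the same as those in Equations (9) and (10).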
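The correlation and lag estimation of Equations (7) to (11) can be illustrated with a short sketch for the temporal case. The function names and the toy series are hypothetical, and the sign convention (healthcare worker infections at day t compared with reported infections at day t + k) is an assumption about how the lag is applied. The same `pearson` helper gives the spatial correlation when it is passed two series indexed by region at a fixed time.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def temporal_cross_correlation(hw, reported, max_lag):
    """Correlation between hw[t] and reported[t + k] for k = 0..max_lag."""
    if len(hw) != len(reported):
        raise ValueError("series must cover the same days")
    return {
        k: pearson(hw[: len(hw) - k], reported[k:])
        for k in range(max_lag + 1)
    }


def estimated_delay(ccf):
    """Lag that maximizes the cross-correlation function."""
    valid = {k: v for k, v in ccf.items() if not math.isnan(v)}
    return max(valid, key=valid.get)


if __name__ == "__main__":
    hw = [0, 1, 4, 9, 4, 1, 0, 0, 0, 0]        # made-up healthcare worker infections
    reported = [0, 0, 0, 0, 1, 4, 9, 4, 1, 0]  # same curve, reported three days later
    ccf = temporal_cross_correlation(hw, reported, max_lag=5)
    print(estimated_delay(ccf))
```

Each lag shortens the overlapping window by one day, so large lags on short series rest on few points and should be read with caution.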
The smaller the estimated delay, the smaller the impact of under-reported cases on the spatiotemporal characteristics, and vice versa. In Wuhan, the peak infection time was estimated to be February 3 from the data on daily reported cases. However, it was estimated to be January 24 from the data on daily healthcare worker infection cases, i.e., a difference of 11 days. The standard deviation estimated from daily reported cases was 7.6 days; 95% of the patients in Wuhan were infected from January 21 to February 19, i.e., within 30.0 days. However, the standard deviation estimated from the daily healthcare worker infection cases was 10.6 days; therefore, 95% of the patients in Wuhan were more likely to have been infected from January 8 to February 22, i.e., within 45.6 days. Figure 9 shows the temporal correlation between the confirmed healthcare worker cases and the reported cases in Wuhan. The temporal correlation coefficient between daily reported cases and new healthcare worker cases in Wuhan was 0.336, which indicated that the temporal correlation between daily reported cases and actual infection cases was weak in Wuhan. The correlation coefficient between the cumulative reported cases and the cumulative healthcare worker cases in Wuhan was 0.926. As shown in Figure 9d, although there was a strong temporal correlation between cumulative reported cases and actual infection cases in Wuhan, the distance between the trend line and the scatter points was still large in the early stage of the epidemic. This indicated that the temporal correlation gradually increased over time and that under-reporting gradually decreased over time. According to Figures 9 and 10, if under-reporting is not considered, the peak infection time estimated from the reported cases may be subject to a time lag. Therefore, we quantitatively analyzed the time lag in Wuhan and Hubei (excluding Wuhan).
Figure 11
In addition, we further quantitatively analyzed the time lag phenomenon in the counties of Wuhan and the cities of Hubei (excluding Wuhan). In Hubei, we calculated the time lag in the cities where the cumulative number of healthcare worker infections exceeded 10. In Wuhan, we calculated the time lag in the counties that still reported daily cases after February 21. Figure 12 shows that the time lag phenomenon has spatial heterogeneity in Wuhan and Hubei. In general, the phenomenon of under-reporting has a great impact on the estimated temporal characteristics of the epidemic, and the impact in Wuhan is greater than that in Hubei (excluding Wuhan). According to the time of epidemic occurrence in different regions, the impact of under-reporting in the early onset area was greater than that in the late onset area, and the impact on the early stage was greater than that on the later stage. In order to analyze the impact of under-reported cases on the spatial characteristics of COVID-19, we first analyzed the spatial distribution and spatial correlation of the data on reported cases and confirmed healthcare worker infections at a single time node in Wuhan and Hubei (excluding Wuhan). Then, the spatial lag of the data on reported cases and confirmed healthcare worker infections was further analyzed in Wuhan and Hubei (excluding Wuhan). Figure 13 shows the spatial distribution and spatial correlation of the healthcare worker infections and reported cases in Wuhan on March 6. According to Figures 7a and 7b, except for the Jiangxia District, the spatial distributions of healthcare worker infections and reported cases in Wuhan were quite similar, which indicates that, although the number of reported cases was less than the number of actual cases, the spatial distribution of reported cases still reflected the spatial distribution of COVID-19.
Moreover, the spatial correlation coefficient between the newly reported cases and the healthcare worker cases was 0.544 in Wuhan on March 6, whereas that between the cumulative reported cases and the healthcare worker cases was 0.889. These results show that there was a certain deviation between the newly reported and healthcare worker cases in Wuhan on March 6; however, there was a high correlation between the cumulative reported cases and healthcare worker cases, which implied that the phenomenon of under-reporting had little impact on the spatial distribution of COVID-19 in Wuhan. in Hubei (excluding Wuhan) fluctuated slightly, prior to February 11. This is because in the early stages of the pandemic, the under-reporting phenomenon was more prevalent. As time passes, the phenomenon of under-reporting will be alleviated and the data on reported cases can better reflect the spatial distribution characteristics of COVID-19. Overall, the impact of the under-reporting phenomenon during the early stages of the pandemic was greater than that during its later stages, which is not only observed with regard to the temporal characteristics, but also in terms of daily new cases. In addition, although the number of reported cases was lower than the number of actual cases, the spatial distribution of the cumulative reported cases in Wuhan and Hubei (excluding Wuhan) still reflected the spatial pattern of COVID-19. Therefore, it is appropriate to use the data on cumulative reported cases to study the spatial distribution characteristics of COVID-19. Figure 16 shows the spatial lag of healthcare worker infection and reported cases in Wuhan and Hubei (excluding Wuhan). The results show that the spatial lag of cumulative cases had an insignificant effect on the spatial correlation in Wuhan and Hubei (excluding Wuhan). However, the spatial lag of daily new cases had a greater effect on spatial correlation, and the effect on Wuhan was greater than that on Hubei (excluding Wuhan). 
To analyze the spatial lag phenomenon further quantitatively, we fixed the lag days and averaged the corresponding spatial correlation coefficients. Overall, the impact of under-reporting cases on spatial lag was similar to the impact on spatial correlation. That is, the impact of under-reporting on the early stages of the pandemic was greater than that on its later stages, and its impact on daily new cases was greater than that on accumulative cases. For COVID-19 epidemic prevention and control, it is important to acquire a timely understanding of the spatiotemporal pattern and determine the development trend of COVID-19 in its early stage. However, under-reporting is a very common phenomenon in public health fields such as epidemiology and biomedicine. When the under-reporting phenomenon is not considered, inaccurate inferences may be produced, which will affect the judgment of decision makers (Paixão et al., 2021) . Therefore, in this paper, a novel framework was proposed to explore the impact of under-reporting on COVID-19 spatiotemporal distributions, and empirical analysis was carried out using infection data of healthcare workers in Wuhan and Hubei (excluding Wuhan). The results show that (1) the lognormal distribution was the most suitable to describe the evolution of epidemic with time; (2) the estimated peak infection time of the reported cases lagged the peak infection time of the healthcare worker cases, and the estimated infection time interval of the reported cases was smaller than that of the healthcare worker cases. (3) The impact of under-reporting cases on the early stages of the pandemic was greater than that on its later stages, and the impact on the early onset area was greater than that on the late onset area. (4) Although the number of reported cases was lower than the actual number of cases, a high spatial correlation existed between the cumulatively reported cases and healthcare worker cases. 
According to the results obtained from the proposed framework, the time lag phenomenon should be considered when temporal characteristics are inferred from the reported cases; otherwise, urban epidemic prevention and control policies may be unreasonable. Compared with existing methods (Lau et al., 2021; Russell et al., 2020; Shen et al., 2020), the proposed framework does not count the actual number of infections, but treats the healthcare worker infection data as more accurate data from which to infer the course of the epidemic. In other words, the proposed framework captures the actual situation of the epidemic indirectly, which avoids complicated calculations and makes it more convenient and faster. In addition, the proposed framework of this study is highly extensible. Relevant researchers can not only use data sources from other counties to analyze the impact of under-reported cases on the spatiotemporal distributions of COVID-19, but also use other types of data sources for the same purpose. The limitations of this study were as follows: The proposed framework needs more datasets for evaluation. We only used healthcare worker infection data in China to explore the impact of under-reported cases on the spatiotemporal distributions of COVID-19 and lack data analysis from other countries. In response to the above limitations, future studies should focus on collecting further domestic and foreign healthcare worker and patient infection data to analyze the impact of under-reported cases on the spatiotemporal distributions of COVID-19 more accurately and comprehensively. The data and codes that support the findings of this study are available in 'figshare.com' with the identifier: http://doi.org/10.6084/m9.figshare.13560455.
The influence of COVID-19 on the societal mobility of urban spaces
Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China
Management rules of Chinese Red Cross Foundation ByteDance healthcare workers humanitarian relief fund
Global cities, hypermobility, and Covid-19. Cities, 103537
A new extension of state-space SIR model to account for Underreporting -An application to the COVID-19 transmission in California and Florida
A Statistical Algorithm for the Early Detection of Outbreaks of Infectious Disease
Under-Reporting of COVID-19 Cases Among Indigenous Peoples in Brazil: A New Expression of Old Inequalities
Under-reported data analysis with INAR-hidden Markov chains
Clinical characteristics of coronavirus disease 2019 (COVID-19) in China: A systematic review and metaanalysis
Geo-temporal distribution of 1,688 Chinese healthcare workers infected with COVID-19 in severe conditions-A secondary data analysis
Clinical Characteristics of Coronavirus Disease 2019 in China
Building an Open Resources Repository for COVID-19 Research
Population flow drives spatio-temporal distribution of COVID-19 in China
Spatio-temporal patterns of the COVID-19 pandemic, and place-based influential factors at the neighborhood scale in Tehran
Evaluating the massive underreporting and undertesting of COVID-19 cases in multiple global epicenters
The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application
Incubation periods of acute respiratory viral infections: A systematic review. The Lancet Infectious Diseases
Retrospective analysis of the possibility of predicting the COVID-19 outbreak from Internet searches and social media data, China
Do prevention and control measures work? Evidence from the outbreak of COVID-19 in China
Incubation Period and Other Epidemiological Characteristics Novel Coronavirus Infections with Right Truncation: A Statistical Analysis of Publicly Available Case Data
Network analysis of population flow among major cities and its influence on COVID-19 transmission in China
Spatiotemporal Patterns of COVID-19 Impact on Human Activities and Environment in Mainland China Using Nighttime Light and Air Quality Data
The impact of COVID-19 on transport volume and freight capacity dynamics: An empirical analysis in German food retail logistics
Pearson correlation coefficient
Estimation of COVID-19 Under-Reporting in the Brazilian States Through SARI
Reconstructing the COVID-19 epidemic in Delhi, India: Infection attack rate and reporting of deaths
Analysis of COVID-19 underreporting in Brazil
Exploring the Spatiotemporal Characteristics of COVID-19 Infections among Healthcare Workers: A Multi-Scale Perspective
Reconstructing the early global dynamics of under-ascertained COVID-19 cases and infections
Accounting for Underreporting in Mathematical Modeling of Transmission and Control of COVID-19 in Iran
Modeling the Epidemic Trend of the 2019 Novel Coronavirus Outbreak in China
Bayesian Versus Frequentist Estimation for Structural Equation Models in Small Sample Contexts: A Systematic Review
Estimation of the Transmission Risk of the 2019-nCoV and Its Implication for Public Health Interventions
Statistical methods for the prospective detection of infectious disease outbreaks: A review: Detection of Infectious Disease Outbreaks
Spatiotemporal characteristics and factor analysis of SARS-CoV-2 infections among healthcare workers in Wuhan
Temporal and spatial analysis of COVID-19 transmission in China and its influencing factors
Spatiotemporal Characteristics of Epidemic in the United States
Under-reporting of deaths limits our understanding of true burden of covid-19
Taking the pulse of COVID-19: A spatiotemporal perspective
The authors would like to thank the anonymous referees, editor, and Prof. Shuming Bao (email: sbao@umich.edu) for their helpful comments and suggestions.