key: cord-0731840-rvdox1ir
authors: Jalali, Niloofar; Tran, N Ken; Sen, Anindya; Morita, Plinio Pelegrini
title: Identifying the Socioeconomic, Demographic, and Political Determinants of Social Mobility and Their Effects on COVID-19 Cases and Deaths: Evidence From US Counties
date: 2022-03-03
journal: JMIR Infodemiology
DOI: 10.2196/31813
sha: 045cc757faab556e2ed226f31fa65c10b97a6c5b
doc_id: 731840
cord_uid: rvdox1ir

BACKGROUND: The spread of COVID-19 at the local level is significantly impacted by population mobility. The U.S. has had extremely high per capita COVID-19 case and death rates. Efficient nonpharmaceutical interventions to control the spread of COVID-19 depend on our understanding of the determinants of public mobility.
OBJECTIVE: This study used publicly available Google data and machine learning to investigate population mobility across a sample of US counties. Statistical analysis was used to examine the socioeconomic, demographic, and political determinants of mobility and the corresponding patterns of per capita COVID-19 case and death rates.
METHODS: Daily Google population mobility data for 1085 US counties from March 1 to December 31, 2020, were clustered based on differences in mobility patterns using K-means clustering methods. Social mobility indicators (retail, grocery and pharmacy, workplace, and residence) were compared across clusters. Statistical differences in socioeconomic, demographic, and political variables between clusters were explored to identify determinants of mobility. Clusters were matched with daily per capita COVID-19 cases and deaths.
RESULTS: Our results grouped US counties into 4 Google mobility clusters. Clusters with more population mobility had a higher percentage of the population aged 65 years and over, a greater population share of Whites with less than high school and college education, a larger percentage of the population with less than a college education, a lower percentage of the population using public transit to work, and a smaller share of voters who voted for Clinton during the 2016 presidential election. Furthermore, clusters with greater population mobility experienced a sharp increase in per capita COVID-19 case and death rates from November to December 2020.
CONCLUSIONS: Republican-leaning counties that are characterized by certain demographic characteristics had higher increases in social mobility and ultimately experienced a more significant incidence of COVID-19 during the latter part of 2020.

In March 2020, COVID-19 was acknowledged by the World Health Organization (WHO) to be a global pandemic [1]. Since then, governments worldwide have implemented a series of lockdown measures intended to reduce the spread of the disease. The efficacy of these measures, in the absence of a vaccine or effective therapy, has varied across countries. Initial evidence on lockdown measures implemented in China suggested that reducing interpersonal physical contact or reducing the movement of the population is an effective means to control the spread of the virus [2]. These findings spurred national and subnational policies restricting population mobility, including social distancing (physical distancing between people who are not from the same household) [3] and stay-at-home (SAH) or shelter-in-place (SIP) orders, which required people to stay at home except for essential activities [4, 5].

In addition to the direct impacts of such policies, evaluating the effects of demographic and socioeconomic factors on population mobility is also important as there were non-pandemic-related events that significantly impacted public movements in the U.S. after the first wave of the pandemic. Specifically, the summer of 2020 witnessed many demonstrations and public rallies in the U.S. in response to a series of events, including the death of George Floyd. Social distancing receded into the background despite rising caseloads and deaths due to COVID-19. The initial decline in public movement that occurred during the early months of the pandemic was succeeded by rapid increases in social mobility through much of the U.S. [6]. Increases in social mobility also occurred as many jurisdictions modified their SAH orders, allowed more businesses to reopen, and relaxed rules on social distancing [7]. This rise in mobility has been linked to higher COVID-19 cases in these regions [8]. Public mobility may have also increased during fall 2020 because of public rallies and social gatherings associated with the US presidential election.

A growing amount of research has used mobility data from social media platforms (Google, Twitter, and Facebook) and mobile phone providers to understand changes in mobility during the pandemic [9, 10], the relationship between population mobility and the spread of COVID-19 cases [8-18], and the effects of nonpharmaceutical interventions (NPIs) on mobility [5, 19, 20]. The consensus from these studies is that increased mobility is associated with higher COVID-19 case counts. Badr et al [15] used cell phone data from 25 counties provided by Teralytics and found that reduced mobility patterns are associated with reduced COVID-19 incidence rates. Using mobile phone data from Safegraph, Gao et al [20] similarly found that lower mobility (more time at home) is associated with a reduced spread of COVID-19 across states. Glaeser et al [19] also used Safegraph data and found reduced mobility to be correlated with lower cases for some US cities. Using Google data from different jurisdictions, other studies found a positive correlation between mobility and COVID-19 case counts [11, 12, 14, 17]. These studies are, however, limited; they investigated social mobility across a small number of US counties during the early days of the pandemic. As such, they were unable to capture socioeconomic, demographic, and political determinants of mobility [21-25].

We evaluated the determinants and consequences of population movements in 1089 US counties from the start of the pandemic to December 2020. This study contributes to the literature by using clustering analysis and other tools to evaluate the impacts of different socioeconomic and demographic characteristics on social mobility in a sample of US counties. We also investigated the effects of such mobility decisions on daily per capita COVID-19 cases and deaths. Social mobility was measured through the use of Google mobility indicators at retail and recreational venues, grocery and pharmacy stores, workplaces, and residences. Robust statistical findings based on such analysis would inform policymakers in crafting efficient and effective NPIs that could curb the spread of COVID-19.

Our results demonstrate that clusters with higher mobility at retail outlets, grocery and pharmacy stores, and workplaces and a lower duration of stay at residences also have a higher percentage of population aged 65 years and over, a larger population share of Whites with less than high school and college education, a higher percentage of the population with less than a college education, a lower percentage of the population using public transit to work, and a smaller share of voters who voted for Clinton during the 2016 presidential election relative to other clusters. The clusters with higher mobility also experienced pronounced increases in per capita COVID-19 daily case and death rates from November to December 2020. These findings are consistent with other studies that suggest that Trump-leaning counties experienced increases in social mobility and less stringent policies after the first wave of the pandemic, which was succeeded by higher levels of disease severity during the latter months of 2020.

The daily numbers of confirmed cases and deaths due to COVID-19 at the county level were downloaded from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) [26]. For the 1089 counties in our sample, the mean (SD) of confirmed cases and deaths (both per 100,000 of population) were 1541.27 (1905.59) and 33.72 (44.78), respectively. Figure 1 reveals the distribution of counties in our sample. There is a significant concentration of counties in the East, Northeast, and certain southern states. There are fewer counties from the midwestern and southwestern parts of the United States. This is because Google mobility data (discussed later) are less available for counties with lower population density. This is a limitation of our analysis.

Data on population mobility were obtained from Google's COVID-19 Community Mobility Reports. Google creates social mobility data from users who have turned on the Location History setting of Google accounts on their phones and have agreed to share this information. Google mobility indicators measure population-level daily visits to the following categories of places: grocery and pharmacy stores, which include grocery markets, food warehouses, farmers' markets, specialty food shops, drug stores, and pharmacies; parks, which consist of local parks, national parks, public beaches, marinas, dog parks, plazas, and public gardens; transit stations, comprising subway, bus, and train stations; retail stores and recreation outlets, consisting of places such as restaurants, cafes, shopping centers, theme parks, museums, libraries, and movie theaters; and workplaces. The Google mobility data also provide an index on the duration of stay at residences. Google mobility indicators for transit hubs and parks were omitted because of large numbers of missing values for the counties included in this study.

A prepandemic baseline mobility value was determined using the median mobility for each day of the week from January 3 to February 6, 2020 [27]. Subsequent mobility values were normalized to baseline. Counties in which no more than 10% of values were missing for each indicator were selected for the study. Missing values were replaced by the average of the 3 prior days. The availability of Google data determined which counties we used in our analysis. The final data set contained observations for 1089 counties, which is roughly 35% of the total number of counties (N=3142) in the United States. Daily values were available for the first and second waves of the pandemic from March 11 to December 31, 2020.
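The selection and imputation rules above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, missing days are represented as `None`, and two details the text does not specify are assumed (previously imputed days count toward the 3-day average, and a gap with no prior observations is left missing).

```python
from typing import List, Optional

Series = List[Optional[float]]


def missing_share(series: Series) -> float:
    """Fraction of days with no reported value."""
    return sum(value is None for value in series) / len(series)


def keep_county(indicators: List[Series], max_missing: float = 0.10) -> bool:
    """Keep a county only if every indicator is at most 10% missing."""
    return all(missing_share(series) <= max_missing for series in indicators)


def impute_prior_mean(series: Series, window: int = 3) -> Series:
    """Replace each missing day with the mean of up to `window` prior days.

    Assumption: prior days that were themselves imputed are reused.
    """
    filled: Series = []
    for value in series:
        if value is None:
            prior = [v for v in filled[-window:] if v is not None]
            value = sum(prior) / len(prior) if prior else None
        filled.append(value)
    return filled
```

For example, `impute_prior_mean([1.0, 2.0, 3.0, None, 5.0])` fills the gap with the mean of the 3 preceding days.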

With the exception of the residential index, daily values for each index were calculated relative to baseline, which was defined as the median for the corresponding day of the week during the 5-week period from January 3 to February 6, 2020. Hence, each daily value is the percentage change in the social mobility category relative to its baseline, which shows how the number of visits to different destinations in a day has changed in percentage terms since the onset of the pandemic. The Google residential index represents the duration of stay at an individual's residence relative to the 5-week baseline. The values in this index are the percentage differences in time spent at home relative to the baseline period.
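The baseline normalization described above amounts to a day-of-week median followed by a percentage change. The sketch below illustrates the arithmetic only; Google publishes the indices already normalized, and the function names and the flat example data are made up for illustration.

```python
from datetime import date, timedelta
from statistics import median
from typing import Dict


def weekday_baseline(daily_visits: Dict[date, float],
                     start: date = date(2020, 1, 3),
                     end: date = date(2020, 2, 6)) -> Dict[int, float]:
    """Median visits for each day of the week within the baseline window."""
    by_weekday: Dict[int, list] = {}
    for day, visits in daily_visits.items():
        if start <= day <= end:
            by_weekday.setdefault(day.weekday(), []).append(visits)
    return {weekday: median(values) for weekday, values in by_weekday.items()}


def percent_change(day: date, visits: float, baseline: Dict[int, float]) -> float:
    """Percentage change relative to the baseline for the same weekday."""
    base = baseline[day.weekday()]
    return 100.0 * (visits - base) / base


# Illustrative data: a flat 100 visits per day across the 5-week baseline window.
visits = {date(2020, 1, 3) + timedelta(days=i): 100.0 for i in range(35)}
baseline = weekday_baseline(visits)
# 60 visits on a later Monday is 40% below that weekday's baseline.
change = percent_change(date(2020, 4, 6), 60.0, baseline)
```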

The 2016 census data were collected by the Massachusetts Institute of Technology (MIT) Election Data and Science Lab [18]. These data were supplemented by county variables collected by other studies [23, 25]. To validate that our samples were representative of all US counties, we compiled summary statistics of socioeconomic and demographic variables between our sample and all counties (Table 1). In summary, there did not seem to be significant differences in most variables between all counties and our sample. The exception is population, where our sample mean was more than 2.5 times that of the mean for all counties. In a similar vein, although all counties have 58% of the population in rural areas, the corresponding statistic for our sample was only approximately 31%. These discrepancies can be explained by the fact that Google's social mobility indicators are only available for counties with larger populations that are more densely populated. This is consistent with the visualization of counties in our sample from Figure 1.

To compress the multidimensional time series, we implemented the variational autoencoder (VAE) architecture based on long short-term memory (LSTM) [29-31]. The principal concept of this generative approach is to project high-dimensional data into latent variables. Our model comprised 4 blocks [32]:

1. Encoder: Defined by the LSTM layers, into which the multidimensional time series input (x) is fed.
2. Encoder to latent layer: Defined by a linear layer, which identifies the mean and SDs of the last hidden layer of the encoder. During the training process, the multivariate Gaussian distributions are defined and reparametrized iteratively by the mean and SDs derived from latent vectors.
3. Latent layer to decoder layer: The latent variables (z) are sampled from the distribution and pass through a linear layer to identify the decoder input.
4. Decoder: Defined by the LSTM layers, which use latent variables (z) to reconstruct the original data [33].

Identifying the true posterior distribution is intractable [33]. Therefore, to construct the original data, the probabilistic encoder model was approximated by a normal distribution, p(z|x) ~ N(0,1), and used as a probability decoder [30, 33]. Hence, the reconstruction of input was defined by sampling from the distribution of latent variables (z).
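The sampling of latent variables from the mean and SDs is commonly implemented with the reparameterization trick, in which the randomness is isolated in a standard normal draw. The following is a minimal sketch of that step alone, not the authors' Keras model; `sample_latent` and its arguments are hypothetical names, and the 3-element vectors mirror the latent dimensionality reported in this paper.

```python
import random
from typing import List


def sample_latent(mean: List[float], sd: List[float],
                  rng: random.Random) -> List[float]:
    """Draw z = mean + sd * eps with eps ~ N(0, 1), one draw per latent dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, sd)]


# Illustrative values only: one county's encoder output for a 3-dimensional latent space.
rng = random.Random(0)
z = sample_latent([0.5, -1.0, 2.0], [0.1, 0.2, 0.3], rng)
```

Because the noise enters only through `eps`, gradients can flow through the mean and SD during training, which is why this form is preferred over sampling directly from the distribution.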

To evaluate the performance of the model, the loss function was defined as follows:

• The divergence from the approximated distribution and the true distribution

• The mean squared error loss, which calculated the difference between the original and reconstructed input data

The total loss is defined as the sum of these 2 losses.
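The formulas for these terms are not reproduced in this text. As a hedged reconstruction, the standard VAE objective consistent with the description would read as follows, assuming a diagonal Gaussian approximate posterior q(z|x) with mean \mu_j and SD \sigma_j in each of J latent dimensions, a standard normal prior p(z), and a reconstruction \hat{x}_i of each of the n input values x_i. The symbols and the unweighted sum are assumptions, not taken from the paper.

```latex
\begin{align}
\mathcal{L}_{\mathrm{KL}} &= D_{\mathrm{KL}}\bigl(q(z \mid x)\,\|\,p(z)\bigr)
  = -\frac{1}{2}\sum_{j=1}^{J}\bigl(1 + \log\sigma_j^{2} - \mu_j^{2} - \sigma_j^{2}\bigr) \\
\mathcal{L}_{\mathrm{MSE}} &= \frac{1}{n}\sum_{i=1}^{n}\bigl(x_i - \hat{x}_i\bigr)^{2} \\
\mathcal{L}_{\mathrm{total}} &= \mathcal{L}_{\mathrm{KL}} + \mathcal{L}_{\mathrm{MSE}}
\end{align}
```

In practice, the divergence is computed against the prior because the true posterior is intractable.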

The model was trained in Python 3.6 using the Keras library [34] with the Adam optimizer. The batch size and number of the epochs were set to 10 and 100, respectively. The number of nodes for encoder and decoder hidden layers was set to 500. The dimensionality of latent variables was set to 3. We also implemented the L1 and L2 regularizers to avoid overfitting.

To evaluate the performance of the model, the VAE total loss was used to identify the reconstruction error between encoder input and decoder output.

Once the model was trained and the encoder, decoder, and VAE were constructed, the output of the encoder model was selected as the representation of the multidimensional patterns of each county. K-means clustering was used to identify the similar segmentation of the counties. To identify the optimum number of clusters as well as the homogeneity of data points within each cluster, the elbow method [35] and the silhouette score [36] were used.
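The clustering step can be illustrated with a small pure-Python K-means that also returns the distortion (within-cluster sum of squared distances) used by the elbow method. This is a sketch under stated assumptions, not the authors' implementation: the function names, the random initialization, and the toy 3-dimensional points standing in for encoder outputs are all hypothetical, and the silhouette score is not shown.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], k: int, iterations: int = 100,
           seed: int = 0) -> Tuple[List[int], List[Point], float]:
    """Lloyd's algorithm; returns (labels, centroids, distortion)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    distortion = sum(squared_distance(p, centroids[label])
                     for p, label in zip(points, labels))
    return labels, centroids, distortion


# Toy latent vectors: two well-separated groups of counties.
latent = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (10.0, 10.0, 10.0), (10.1, 10.0, 10.0), (10.0, 10.1, 10.0)]
# Elbow method: inspect how distortion falls as k grows.
distortions = {k: kmeans(latent, k)[2] for k in (1, 2, 3)}
labels, centroids, _ = kmeans(latent, 2)
```

The elbow is the value of k beyond which further increases yield only small reductions in distortion.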

To compare the socioeconomic characteristics of the counties in each cluster, the 2016 MIT election data were used as input, while the classes were the cluster labels. The data were divided into training and testing sets with a 70:30 split, respectively. The random forest classifier [37] with 10-fold cross-validation was used to build the predictive models. The area under the curve (AUC) of the model was calculated, and the most important features associated with the cluster numbers were defined as the parameters describing the characteristics of counties in each cluster. Feature scores of different census variables for the clusters were computed, which indicated the relative importance of different socioeconomic and demographic factors for explaining the different clusters. Figure 3 summarizes our approach.

This study leveraged a partitioning-based deep learning model to cluster counties based on similarities in social mobility. For each county included in this study, data were clustered based on a combination of the daily values of the 4 Google mobility indicators (retail, grocery and pharmacy, workplace, and residence). The multidimensional time series of Google social mobility indicators from 1089 counties was divided into training and testing sets and fed into the VAE model. The result demonstrated a loss of 0.08. The latent variables were extracted as the output of the encoder. The K-means clustering algorithm identified 4 social mobility clusters. The clusters, labeled 0, 1, 2, and 3, contained 215, 338, 473, and 59 counties, respectively. Figure 4 gives the distortion scores of the K-means clustering.

Across all clusters, visits to retail stores fell significantly after the start of the pandemic until around mid-April, followed by a steady increase and plateauing in early July (Figure 5). Visits to retail outlets began to decline again in late September but then began an upward trend starting on Thanksgiving weekend until the end of December. Retail social mobility values were the highest for cluster 0, followed by clusters 2 and 1, with cluster 3 having the lowest social mobility. Grocery and pharmacy mobility trends reflected those seen for retail social movements but were less pronounced (Figure 6). Cluster 0 had the highest values of grocery mobility, followed by clusters 2, 1, and 3. Workplace mobility showed an initial decline at the start of the pandemic, followed by a steady increase from early May onward (Figure 7). Spikes in mobility were observed during the weekend, which did not significantly decline relative to prepandemic observations. County clusters followed the same order, with cluster 0 having the greatest mobility, followed by clusters 2, 1, and 3. Finally, residential mobility followed a reverse pattern relative to the other indicators, with cluster 3 having the highest mobility, followed by clusters 1, 2, and 0 (Figure 8). Residential mobility was highest during the onset of the pandemic, followed by a decreasing trend during spring and summer. From late September onward, residential mobility began to increase, and this trend continued until the end of the sample period. The spikes in mobility captured the weekend effects. Our social mobility data indicated differences in mobility between clusters, with counties in cluster 0 having the highest retail, grocery, and workplace mobility and the lowest residential mobility. In contrast, counties in cluster 3 had the lowest social mobility and the highest residential mobility.

To determine whether county characteristics are correlated with differences in social mobility between the clusters, we obtained socioeconomic, demographic, and political data from each county from 2016 census data [18] . These data included 2016 election returns, race, median income, total population, percentage of rural areas, and education level of the population for age and race. These data were supplemented by county variables collected by other studies [23, 25] .

A random forest classifier was used to generate feature scores of different socioeconomic and demographic characteristics of the counties included in each cluster, across all 4 clusters (mean receiver operating characteristic [ROC] AUC 0.871). Table 2 contains the feature scores of all county-level variables.

The top 10 variables in terms of feature scores were percentage of the population aged 65 years and over (0.41715), percentage of females (0.08784), percentage of Whites (0.03869), percentage of Whites with less than college education (0.03772), percentage of Hispanics (0.03369), percentage of Whites with less than high school education (0.03178), percentage of the population using public transit (0.02967), county unemployment rate (0.02759), proportion of voters for Clinton in 2016 (0.02737), and percentage of the population with less than high school education (0.02719). Hence, although political preference and population composition were important, 3 educational variables also ranked among the top 10, and the percentage of the population with less than college education was the 11th variable in terms of feature score.

To explore the top 11 socioeconomic, demographic, and political variables impacting social mobility further, we determined the mean population percentage for each county-level variable across clusters (Table 3). The table also contains results of statistical tests of significance of sample means between clusters. The Z test of sample means was performed to compare the significance of different county-level variables for different clusters. Results demonstrated several variable similarities for clusters with the highest social mobility. The percentage of the population aged 65 years and over, the percentage of Whites, the percentage of Whites with less than high school and college education, and the percentage of the overall population with less than college education were higher in counties defined by clusters 0 and 2.
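A two-sample Z test of means, as named above, can be sketched with the standard library alone. The exact variant the authors used is not specified, so this assumes an unpooled standard error and a two-sided P value; the function name and the sample values are illustrative and are not taken from Table 3.

```python
from math import sqrt
from statistics import NormalDist, mean, variance
from typing import Sequence, Tuple


def two_sample_z(sample_a: Sequence[float],
                 sample_b: Sequence[float]) -> Tuple[float, float]:
    """Return (z statistic, two-sided P value) for a difference in means."""
    standard_error = sqrt(variance(sample_a) / len(sample_a)
                          + variance(sample_b) / len(sample_b))
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up percentages of residents aged 65 years and over in two clusters.
cluster_a = [10.0, 12.0, 14.0, 16.0, 18.0]
cluster_b = [1.0, 3.0, 5.0, 7.0, 9.0]
z, p_value = two_sample_z(cluster_a, cluster_b)
```

With county samples in the hundreds per cluster, the normal approximation behind the Z test is reasonable; for very small samples, a t test would usually be preferred.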

Tests of equality of sample proportions and means confirmed that there was a statistically significant difference between clusters 0 and 2 versus clusters 1 and 3 for these population variables. In contrast, the percentage of Hispanics, percentage of the population using public transit for work, and percentage voting for Clinton in 2016 were lower in clusters 0 and 2 relative to clusters 1 and 3. There was no consistent, significant difference across clusters for the percentage of females, population with less than high school education, and unemployment rates. 

Given that policies restricting population mobility were established to curb the spread of COVID-19, we sought to determine whether county clusters with higher social mobility indicators (clusters 0 and 2) reported elevated viral cases and deaths. The daily number of confirmed cases and deaths due to COVID-19 at the county level was obtained from the CSSE at the JHU. We determined the median daily per capita cases (Figure 9) and deaths (Figure 10) by cluster. During the first months of the pandemic, per capita daily cases were quite comparable across clusters (Figure 9). There was a visible divergence that occurred at the beginning of October (onset of the second pandemic wave), with daily cases rising sharply in clusters 0, 1, and 2 relative to cluster 3. For the remainder of the period examined, cluster 0 had the highest number of daily cases, followed by clusters 2 and 1. Cluster 3 retained relatively lower daily cases. Interestingly, clusters 0 and 2 had lower daily deaths until the beginning of September (Figure 10). Daily deaths in these clusters then increased rapidly, and by the beginning of October, per capita deaths in clusters 0, 1, and 2 were higher than in cluster 3.
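Matching clusters to outcomes comes down to converting county counts to per capita rates and taking the median across the counties of each cluster on each day. The sketch below shows that aggregation; the record layout, the function names, the per-100,000 scaling, and the example counts are assumptions for illustration rather than the authors' pipeline.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple

# (cluster label, day, new cases in the county, county population)
Record = Tuple[int, str, int, int]


def per_100k(count: int, population: int) -> float:
    return 100_000 * count / population


def median_rate_by_cluster(records: Iterable[Record]) -> Dict[Tuple[int, str], float]:
    """Median per capita daily rate across counties, keyed by (cluster, day)."""
    grouped = defaultdict(list)
    for cluster, day, new_cases, population in records:
        grouped[(cluster, day)].append(per_100k(new_cases, population))
    return {key: median(rates) for key, rates in grouped.items()}


# Made-up counts for four counties on a single day.
records = [
    (0, "2020-11-01", 50, 100_000),
    (0, "2020-11-01", 30, 200_000),
    (0, "2020-11-01", 10, 50_000),
    (3, "2020-11-01", 5, 100_000),
]
rates = median_rate_by_cluster(records)
```

Using the median rather than the mean keeps a single county with an outbreak from dominating its cluster's daily value.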

This study aimed to assess the effect of county-level characteristics on population mobility and the consequences of this mobility on the spread of COVID-19. To the best of our knowledge, this is the first study that has used unsupervised machine learning to understand differences in population mobility across US counties during the first and second waves of the pandemic and determine the relative importance of a wide array of socioeconomic, demographic, and political variables in defining different mobility-based clusters.

Our results demonstrate that of the 4 clusters defined by Google social mobility indicators, the clusters with higher retail, grocery, and work mobility (and lower residential mobility) had several similar population characteristics. Specifically, counties with greater social mobility also had a higher percentage of the population aged 65 years and over, Whites with less than high school and college education, and overall population with less than college education. Counties in these 2 clusters also had a lower share of the population that is Hispanic, a lower percentage of the population using public transit to work, and a lower share of voters who voted for Clinton during the 2016 presidential election. Research does suggest that Whites with less than college education constituted a significant voting bloc for Trump during the 2016 election [38]. In line with this, the 2 clusters with the greatest social mobility also experienced higher per capita COVID-19 case and death rates during most of November and December 2020. These results are consistent with Xie and Li [39], who also used county-level data during the early days of the pandemic and found lower education levels to be correlated with higher infection rates.

The significant increase in COVID-19 cases and deaths in clusters 0 and 2 during November and December 2020 could be a consequence of public rallies and general disregard for social distancing and safety protocols by pro-Trump voters [40].

Although we cannot prove this, the majority of counties in these clusters were Republican leaning during the 2016 presidential election. Moreover, our finding of higher per capita daily COVID-19 cases and deaths in such counties is consistent with other studies. Desmet and Wacziarg [41] found that early on during the pandemic, Republican counties actually experienced lower COVID-19 cases and therefore had lax attitudes toward mask wearing, social distancing, and lockdown measures. However, as the pandemic spread to Trump-leaning counties, population preferences for less stringent social distancing policies had already been formed, making it difficult for policymakers to implement stricter restrictions on social mobility. As a result, this led to greater disease severity in Trump-leaning counties. In a similar vein, Allcott et al [42] found that areas with more Republicans engaged in less social distancing, controlling for other factors, including public policies. In summary, these findings corroborate our own results. Social mobility in the aftermath of the first wave of the pandemic was much higher in Republican counties, which ultimately resulted in higher COVID-19 cases and associated deaths relative to other counties that were Democrat leaning.

Social media is increasingly being used to capture population movements and understand their corresponding impacts on COVID-19 incidence. Social media-based data, including those presented here, have some limitations. Specifically, there is the possibility of sample selection bias if Google Maps users have specific demographic characteristics and are not distributed uniformly across the population. However, data from Statista indicate that in the U.S., Google Maps had 154 million users in April 2018 [43]. Further, published research has compared Google mobility data against corresponding cellular-generated information from other providers and has found a close correspondence. Specifically, Szocska et al [44] constructed a mobility index and an SAH/resting index based on data on almost all phone subscribers in Hungary and found a close correlation with corresponding Google mobility indices at the national level. There are also a significant number of published studies that have used Google mobility data to capture population movements for different countries and have found them to be important in predicting movements in COVID-19 (Bryant and Elofsson [11], Askitas et al [45], and Stevens et al [46]). For these reasons, we think there is a high likelihood that Google mobility data do reflect population movements. However, Google mobility data do not include information on certain types of public movements, such as election rallies or community gatherings.

Our research demonstrates the usefulness of publicly available Google mobility data and unsupervised machine learning methods in establishing relationships between county-level characteristics, mobility decisions, and COVID-19 incidence. These findings have important implications for policymakers and public health officials in understanding the effects of NPIs, as the efficacy of such measures on mobility is influenced by underlying socioeconomic, demographic, and political ideology characteristics. The use of Google data enables researchers to assess the types of public movements that are most contributory to COVID-19 spread.

The results of this study provide a unique lens on the potential of machine learning to understand social mobility behaviors. These findings are critical for public health organizations trying to understand the levels of mobility in their counties, in addition to providing insights into some of the underlying factors (ie, social determinants of health) contributing to regional differences in COVID-19 caseloads.

Our results emphasize a role for machine learning methods in public health. Publicly available Google data, in conjunction with census data, can be used to understand the socioeconomic, demographic, and political determinants driving population mobility choices across US counties. This knowledge can assist policymakers in developing NPIs to restrict viral spread during the COVID-19 pandemic.

WHO declares COVID-19 a pandemic

Report of the WHO-China Joint Mission on Coronavirus Disease

When do shelter-in-place orders fight COVID-19 best? Policy heterogeneity across states and adoption time

Summer of Digital Protest: How 2020 Became the Summer of Activism Both Online and Offline

See Reopening Plans and Mask Mandates for All 50 States

Mobile device data reveal the dynamics in a positive relationship between human mobility and COVID-19 infections

Use of Twitter social media activity as a proxy for human mobility to predict the spatiotemporal spread of COVID-19 at global scale

Monitoring the spatial spread of COVID-19 and effectiveness of control measures through human movement data: proposal for a predictive model using big data analytics

Estimating the impact of mobility patterns on COVID-19 infection rates in 11 European countries

Community movement and COVID-19: a global study using Google's Community Mobility Reports

Geo-located Twitter as proxy for global mobility patterns

Using a partial differential equation with Google Mobility data to predict COVID-19 in Arizona

Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study

Associations between phone mobility data and COVID-19 cases

Face masks, public policies and slowing the spread of COVID-19: Evidence from Canada

Examining the change of human mobility adherent to social restriction policies and its effect on COVID-19 cases in Australia

JUE insight: how much does COVID-19 increase with mobility? Evidence from New York and four other U.S. cities

Association of mobile phone location data indications of travel and stay-at-home mandates with COVID-19 infection rates in the US

Demographic, jurisdictional, and spatial effects on social distancing in the United States during the COVID-19 pandemic

Political partisanship influences behavioral responses to governors' recommendations for COVID-19 prevention in the United States

What Does and Does Not Correlate with COVID-19 Death Rates

Racial Disparity in COVID-19 Deaths: Seeking Economic Roots with Census Data

Socioeconomic Covariates of Behavioral Responses and Viral Outcomes Across US Counties (Report No.: w27549)

COVID-19 Dashboard

MEDSL/2018-elections-unoffical

Auto-encoder Based Data Clustering

HVGH: unsupervised segmentation for high-dimensional time series using deep neural compression and statistical generative model

Understanding user behavior through the use of unsupervised anomaly detection: proof of concept using Internet of Things smart home thermostat data for improving public health surveillance

Unsupervised classification of multi-omics data during cardiac remodeling using deep learning

Variational Recurrent Autoencoder for Timeseries Clustering in Pytorch

The optimally designed variational autoencoder networks for clustering and recovery of incomplete multimedia data

Keras: The Python Deep Learning API

Using the Elbow Method to Determine the Optimal Number of Clusters for K-means Clustering

Silhouettes: a graphical aid to the interpretation and validation of cluster analysis

Identifying Feature Relevance using a Random Forest

An Examination of the 2016 Electorate, Based on Validated Voters

Health and Demographic Impact on COVID-19 Infection and Mortality in US Counties

Trump Supporter on Not Wearing a Mask: It's a Fake Pandemic

Understanding spatial variation in COVID-19 across the United States

Polarization and public health: partisan differences in social distancing during the coronavirus pandemic

Countrywide population movement monitoring using mobile devices generated (big) data during the COVID-19 crisis

Estimating worldwide effects of non-pharmaceutical interventions on COVID-19 incidence and population mobility patterns using a multiple-event study

Estimating the effects of non-pharmaceutical interventions and population mobility on daily COVID-19 cases: evidence from Ontario

The authors thank Caitlin S Brown, Chris Knittel, and Bora Ozaltun for kindly sharing their data.

None declared.

©Niloofar Jalali, N Ken Tran, Anindya Sen, Plinio Pelegrini Morita. Originally published in JMIR Infodemiology (https://infodemiology.jmir.org), 03.03.2022. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Infodemiology, is properly cited. The complete bibliographic information, a link to the original publication on https://infodemiology.jmir.org/, as well as this copyright and license information must be included.