key: cord-0940106-6rfcrrwa
authors: Kariburyo, Mohamed Shabani; Andress, Lauri; Collins, Alan; Kinder, Paul
title: Place Effects and Chronic Disease Rates in a Rural State: Evidence from a Triangulation of Methods
date: 2020-09-14
journal: Int J Environ Res Public Health
DOI: 10.3390/ijerph17186676
sha: e9b7f75b1719397cfb7b8a85ae7c3049f902c9e0
doc_id: 940106
cord_uid: 6rfcrrwa

High rates of chronic diseases and increasing nutritional polarization between different income groups in the United States are issues of concern to policymakers and public health officials. Spatial differences in access to food are mainly blamed as the cause for these nutritional inequalities. This study first detected hot and cold spots of food providers in West Virginia and then used those places in a quasi-experimental method (entropy balancing) to study the effects of those places on diabetes and obesity rates. We found that although hot spots have lower rates of chronic diseases than non-hot spots and cold spots have higher rates of chronic diseases than non-cold spots—the situation is complicated. With the findings of income induced chronic disease rates in urban areas, where most hot spots are located, there is evidence of another case for "food swamps." However, in cold spots which are located mainly in rural areas, higher rates of chronic diseases are attributed to a combination of access to food providers along with lacking the means (i.e., income for low-income households) to form healthier habits.

Poor diet is a modifiable risk factor for obesity and diabetes. Contributing to substandard nutrition, the food environment is also thought to be a primary driver of poor diet. Health advocates, community leaders, and researchers are worried that food environment problems, and poor diets in general, may be more severe in certain low income and rural American communities because these areas have limited access to affordable and nutritious foods. A primary concern is that some poor or rural areas do not have access to supermarkets, grocery stores, or other food retailers that offer the large variety of foods needed for a healthy diet (for example, fresh fruits and vegetables, whole grains, fresh dairy, and meat products) [1] [2] [3] [4] . Instead, individuals in these areas may be more reliant on food retailers or fast food restaurants that only offer more limited varieties of foods. It is hypothesized that the relative lack of access to full-service grocery stores and the easier access to fast and convenience foods may be linked to poor diets, and ultimately, to obesity and other diet-related diseases [5, 6] .

It was this concern that led Congress, in the Food, Conservation, and Energy Act of 2008, (hereafter referred to as the 2008 Farm Bill) to instruct the U.S. Department of Agriculture (USDA) to conduct a 1-year study assessing the extent of the problem of "food deserts". More specifically, the remit in the 2008 Farm Bill called upon the USDA to study the limited access to food, characteristics, and causes of limited access, the effects limited access has on local populations and recommendations for addressing the causes and effects of limited access. As a result, numerous researchers have tried to empirically measure the association between the food environment and obesity and diabetes during this past decade. These studies have investigated whether living in a "food desert" or "food swamp" caused higher rates of these chronic diseases or increased nutritional inequality between the wealthier and the less fortunate population [7] [8] [9] . Food deserts are often described as residential areas with limited vendors of affordable and healthy foods [8] . In comparison, food swamps are areas with a higher number of unhealthy vendors, such as convenience stores and fast food restaurants [10] . Despite this surge of researchers, most studies still failed to prove the causal claim that underserved areas are associated with an increased risk of chronic diseases, with few exceptions [8, 10] . Nonetheless, federal and local governments continue to support the supply-side through initiatives such as the Healthy Food Financing Initiative and the Agricultural Act of 2014. (Since 2011, the healthy food financing initiative has been providing subsidies to grocery stores, farmers markets, and other actors working in underserved areas. The Agricultural Act of 2014 allocated about 125 millions in federal funds to support access to healthy food in underserved areas [9] .) It is in this context that some researchers have started to question the metaphor of a "food desert" and its applicability in different geographical entities in the United States. This argument states that studies in this field should not just measure the food environment as binary-areas where people live in a food deserts or food swamps, and areas where people do not live in these regions. Instead, it is important to consider how locals perceive their food environment [2, 11] . This is a way to include the spatially-explicit factors that individuals have to face on a daily basis. Furthermore, the food environment is believed to be different in urban and rural areas-what is considered a "food swamp" in one area might be a "food oasis" in another [12] . Therefore, to understand the impact of the food environment on chronic diseases, it is crucial to properly define the food environment spatial measure.

In this research, we explicitly account for spatial drivers that may influence the supply of food vendors and study how these places might impact diabetes and obesity rates in West Virginia. This state is an interesting case to study. It leads the nation in obesity rates and is ranked the second-highest in diabetes rates nationally (based on 2018 statistics provided by the West Virginia Department of Health and Human Resources Bureau for Public Health (https://dhhr.wv.gov/hpcd/data_reports/pages/ fast-facts.aspx)). This study will add to efforts that evaluate and understand how the neighborhood retail food environment contributes to these chronic diseases. In economics and public health literature, this paper is related to articles that studied the causal link between the food environment and chronic diseases ( [8] [9] [10] [11] [13] [14] [15] ). We combine the hot spot analysis results with a quasi-experimental approach (entropy balancing) to study how diabetes and obesity rates might be impacted by these places (hot and cold spots). To our knowledge, this is the first study to combine results from the hot spot analysis and a quasi-experimental approach to address this issue. First, we wanted to test whether places with an abundance of food providers (hot spots) and those that lack food providers (cold spots) are associated differently with diet-related diseases. Second, we wanted to understand whether the difference in these chronic diseases rates between hot (and cold) spots depends on whether a census tract is urban or rural. Finally, we wanted to test whether there is a significant diet-related disease-income relationship between hot (and cold) spots.

The remainder of the article is organized as follows: in Section 2, a literature review on chronic diseases and place-based policies is elaborated; Section 3 explains the methods and the data used in this study; Section 4 presents the results; Section 5 provides some discussion and policy implications; finally, Section 6 discusses about the limitations of this research.

Studies associating chronic diseases and the overall food environment across different geographical regions have been surging in economics and public health fields. These studies consider whether neighborhood food vendors play a role in nutritional inequality, and therefore, chronic diseases.

The literature behind the associations between healthy and unhealthy food outlets and chronic diseases has heavily relied on spatial analysis or geographic information system data ( [5, 9, 14] ). To examine the association between food deserts (and food swamps) and consumption of snacks/desserts or fruits/vegetables, Hager et al. combined geocoded home addresses and maps of food deserts/food swamps [16] . They found that adolescent girls living in an unhealthy environment (food swamps) had a higher intake of snacks/desserts than girls not living in these areas. Further tackling the issue of food swamps and food deserts, Cooksey-Stowers et al. combined commercial street reference, socioeconomic, food environment, and obesity datasets [8] . Their results suggest that food swamps are associated with higher obesity rates, but the absence of a full grocery store shows inconclusive results. Jang et al. investigated the intersectionality of race/ethnicity and poverty in terms of spatial access to supermarkets, grocery stores, and convenience stores in the Detroit metropolitan area [17] . They suggest that place-based policies can successfully improve health intervention strategies when implemented by federal, state, and local authorities.

By further examining nutritional inequalities in food swamps and food deserts, Phillips et al. found that the association between hospitalization rates among adults and diabetes followed a curvilinear relationship, plateauing at the saturation point of unhealthy food stores [10] . Using a measure of diabetes incidence, Christine et al. showed that having resources that support physical activity and healthy diets in the neighborhood can lower the incidence of type 2 diabetes [18] . In most research, the definition of chronic disease measures is clearly specified with the association between food environment and diabetes incidence [18, 19] being analyzed, and the relationship between food swamps and a measure of severity being treated in [10] .

There are also recent articles, using large, nationally representative samples that have started to question the hypothesis that the neighborhood food environment (i.e., supply side factor) reduces nutrition inequality. Allcott et al. argued that increasing the supply of healthy groceries alone does not reduce nutritional disparities ( [9, 20] ). Using counterfactual simulations, they demonstrated that simply fixing food deserts by adding stores or responding only to supply-side issues achieves merely a ten percent reduction in nutritional inequality. Alternatively, the researchers assert that the remaining ninety percent difference in nutritional inequality might be driven by any number of demand-side factors in their simulations, including education, prices, and/or preferences [9] .

Johnson et al. highlighted a vital trend in the neighborhood food environment literature-most research on this subject tends to be on urban communities and rural communities are mainly left behind [21] . Bailey sounded the alarm on how traditional grocery stores are becoming scarce in rural America [22] . Using a cross-sectional study, Sharkey showed that in Texas border colonia households, convenience stores were the critical variable that influenced nutrients intakes [12] . The demand for food in rural America relies on many different healthy and unhealthy stores due to the shortages in the supply. Some households depend on dollar stores, big box retailers (e.g., Walmart), convenience stores (e.g., Sheetz), fast food restaurants, and farmers markets [12, 21] . Understanding places might be the missing point in solving nutritional and health disparities in these communities. Johnson et al. focused on rural areas and identified determinants related to access to healthy and affordable food to improve population health and reduce health inequalities in rural America [21] .

Further, studies focusing on either rural or urban areas may differ in findings because of the study area and empirical methodologies. Alviola et al. found that fast-food restaurants and convenience stores were the main determinants of food deserts in a rural state like Arkansas [15] . Allcott et al. demonstrated how the addition of a healthy store in certain geographical areas does little in reducing nutritional inequalities. Their national representative data helped them advance some policy implications that could help federal, state, and city officials in their intervention strategies [9] .

Overall, much of the previous research has highlighted the need to provide more evidence on the effect of supply inequities on chronic diseases. The lack of causal effect studies that try to understand the association between chronic diseases and food environment leaves many unanswered questions on this matter. By evaluating how places in a largely rural state impact diabetes and obesity rates, our research aims to pinpoint areas that are at risk of chronic diseases.

To understand the relationship between West Virginia's food environment (place effects) and chronic diseases, this research focuses on food provider geographic locations with the caveat that while other factors impacting food access certainly exist, they are beyond the scope of this research. Our specific approach to this issue offers a triangulation of methods where we combine results from a spatial analysis method (hot spot analysis) with a matching technique (entropy balancing), and an ordinary least square (OLS) regression to understand whether the extent of food providers' presence and/or absence impacts chronic diseases rates in West Virginia. Recent literature reviews suggest that this combination of methodologies to identify the effects of retail food providers on chronic disease rate incidence constitutes a new approach in this literature [23] . (Reference [23] utilized the same logic while studying the economic impact of organic agriculture hot spots in the United States. The main methodological difference between the two studies is that [23] computed the hot spot analysis then used the propensity score matching (PSM), whereas our study used the entropy balancing method and perform a sensitivity analysis using the propensity score weighting and the kernel matching method.)

To capture the geographic distribution and patterns of food stores in West Virginia, this research utilizes a dataset from the WV Foodlink project (http://foodlink.wvu.edu/) housed at the food justice laboratory within the Department of Geology and Geography at West Virginia University. In collaboration with their community partners, this laboratory has developed a resource hub and learning commons to support a people-centered, resilient food network in West Virginia. These data contain the geolocation of charitable assistance agencies, big box retailers, grocery stores, small box retailers, convenience stores, farmers markets, and undefined SNAP accepting retailers. In this data, we count 68 big box stores, 996 convenience stores, 134 farmers markets, 320 grocery stores, and 587 small box retailers. Toward the end of 2015, the WV Foodlink project verified the validity of these locations by sending participants from this project to the specific site of a food store (or outlet) to confirm their actual latitude and longitude. In this research, food stores geographic coordinates were used as a basis in all spatial analysis conducted.

In the spatial analysis practice, researchers usually use two methods to map clusters: heat maps and hot spot analysis. The heat map as illustrated in Figure A2 utilises a color gradient that indicates areas of increasingly higher density. In our map, food vendors tend to cluster around West Virginia larger cities-and it is very difficult to visualize where the population is lacking food vendors-which is why we focus on the hot spot analysis in this research. Hot spot analysis is a statistical method that assesses geographic clustering. This method is a local indicator of spatial autocorrelations (LISAs)-it uses variations in local spatial autocorrelation compared to surrounding areas, or areas in which clusters of density or intensity are statistically distinct from the neighboring landscape-an option that a heat map does not offer. Figure A1 in the appendix displays the specific latitudes and longitudes of all the stores in West Virginia. Each point, as described in the legend, represents a specific type of store. Separating between healthy and unhealthy food suppliers is one option that this research could have followed; however, in some areas in West Virginia, families rely on stores that are considered unhealthy to purchase all their fresh produce. (According to the American Healthy Food Financing, West Virginia is facing a growing problem of store closures-leaving behind thousands of families. As a result, families rely more and more on closer stores such as convenience stores or Mom and Pop stores; see https://www.investinginfood.com/commentary-w-v-projects-try-new-ways-to-deliver-freshfood-options/.) To account for these inequities, this research defines a hot spot as an area which has a much higher than average count of food providers, and a cold spot as an area which has a much lower than average count of food providers. (Clusters of stores whose values are atypically high or low compared to the entire sample. Given a chosen level of statistical significance (e.g., p < 0.001), the hypothesis test identifies high (low) value observations surrounded by other high (low) value observations, where the difference between values observed for these identified clusters and those for the surrounding observations is too great to be the result of random chance. For example, Kanawha county, which is in a hot spot, has 30 grocery stores, 5 big box stores, 80 small box stores, 96 convenience stores, and 1 farmers market. Pocahontas county, a cold spot in our analysis, has 7 grocery stores, 0 big box store, 4 small box stores, 1 convenience store, and 0 farmers markets.) Optimized hot spot analysis (for more details on this method please see the ArcMap manual: https://desktop.arcgis.com) is used as a spatial data mining tool that asks the data (in our case, latitude and longitude of each specific store) to obtain the settings that yield the optimal hot spot results. It uses the Getis-Ord G i * which works as follows:

where x j is the attribute value for food provider j; w i,j is the spatial weight between food providers i and j. The choice of a weight matrix is a very important step during the hot spot analysis and in our case, we use an inverse distance weight matrix. This distance-based weight matrix with a distance band of 10 miles between food providers because the USDA defines low food access census tracts as those who have 500 persons and/or at least 33 percent of the population who live more than 10 miles from a supermarket or a large grocery store (https://www.ers.usda.gov/data-products/food-accessresearch-atlas/documentation/). Using this type of distance-based weight matrix, a food provider that is farther away, receives less weight in the analysis. n is equal to the total number of features and X bar is just an average of n attribute values across West Virginia. Lastly, Equation (2) is the square root of the average of the squares of the sum of all attributes minus the average attribute squared.

The G * i statistic can be thought of as a z-score so that no further statistical calculation is required. Figure 1 clearly shows the highly uneven food environment landscape in West Virginia. The most common stores in West Virginia are convenience and small box retailers which make up about 76 percent of the retailers in the state. This unevenness was the reason why it was important to row standardize this matrix to give each feature the same weight in the analysis. Figure 2 shows a map depicting the results of the hot spot analysis. Among the 484 census tracts in West Virginia, those red areas show census tracts that are hot spots of food providers at a 99 percent confidence interval. The light gray areas are census tracts that represent cold spots at a 90 percent confidence interval. Every statistically significant hot or cold spot area was reported (Figure 2 displays a legend with the level of statistical significance at the 90, 95, and 99 percent; we used these visual explanations to designate a census tract as hot or cold spots).

Using the results from this map, a census tract level indicator variable is created, which takes a value of 1 if the census tract is identified as being part of a statistically significant hot or cold spot, and 0 otherwise. To examine the impact of hot and cold spots on chronic diseases, this research characterizes hot and cold spots as being "treatments". It measures the impact of the treatment on a census tract diabetes and obesity rates. 

A big challenge of this empirical work is to establish an econometric link between the chronic diseases of diabetes and obesity and those places identified as hot or cold spots for food providers. Several endogenous variables might lead a census tract to being identified as a hot or cold spot. To mitigate the potential endogeneity issue that might arise with being in a hot spot or cold spot, we employ a matching technique.

Since the analysis is based on the concept that a hot or cold spot represents a treatment-our measure of interest is the average treatment effect on the treated ATT, which is defined as:

where D, is an indicator variable that is equal to 1 if a census tract is a hot (or cold) spot and 0 otherwise. For a specific census tract, we denote E[y 0 /D = 1] as the counterfactual-that is, the outcome of a census tract in a hot (or cold) spot would have been realized had they not been positioned in those geographical locations; and E[y 1 /D = 1] as the outcome after being treated as a hot (or cold) spot. The counterfactual is unobservable in this case; therefore, a suitable proxy to solve for ATT is needed.

In the case of a randomized control trial, the average outcome of units not exposed to treatment, E[y 0 /D = 0], is a proper substitute. However, as discussed before, being in a hot or cold spot, and thus, selection into treatment, could be endogenous [24, 25] . The matching process's key idea is to facilitate a situation of randomization where the assignment to the treatment can happen by a chance procedure. The unobserved counterfactual outcome is assigned by matching the treated units with untreated units that are as similar as possible with regard to all pretreatment covariates that are associated with selection into treatment (being in a hot or cold spot) and impact the dependent of variables of interest (diabetes or obesity rates). The estimate representing ATT based on matching can be written as follows:

x is a vector of relevant covariates, which are described in the following subsection, E[y 1 /D = 1, X = x] is the expected outcome for the census tracts that were in the hot (or cold) spots, and E[y 0 /D = 0, X = x] is the expected outcome for the best matches of the hot (or cold) spots census tracts [26] .

Studies that have investigated the effectiveness of place-based policies have used nearest neighbor matching, propensity score matching, Mahalanobis distance matching, and entropy balancing [27, 28] . Marcus et al. argued that because of the sensitivity to the choice of matching variables, entropy balancing balances the matching covariates more effectively than the propensity score matching [24] . Furthermore, Deb et al. elaborated a summary of the main difference between entropy balancing and the propensity score matching: the former does not require the researcher to first estimate the weights, and then verify whether the treatment and the control covariates balance like the latter does [29] . Hainmueller showed that entropy balancing is better suited than either the propensity score or Mahalanobis distance matching. Entropy balancing generates a weighting scheme that reweights control observations so that they are perfectly balanced compared to treated individuals based on the mean moments of covariate distribution. After weighting, the treated indicator is orthogonal to covariate moments included in the entropy balancing exercise. Finally, the weights are then incorporated in Equation (5) as a survey weight (one can think about these survey weights or weights as similar to those that are created by the logit model of the propensity score analysis [25] ):

where Y i is the rate of diabetes or obesity; and x i is a set of independent variables that includes whether a census tract is considered in treatment (hot spots and cold spots), an urban versus rural designation for each census tract, mean household income, and the proportions of the population that are employed in the sectors of agriculture, fisheries, hunting, and mining (AFHM). The AFHM variable was selected because it is important to understand the relationship between these sectors, which have been hit harshly by economic changes, and the food access inequality [30] . Finally, Mactaggart investigated how health disparities in rural settings may be exacerbated by AFHM occupations [31] . The independent variables of interest include those to test the first set of key hypotheses that diabetes and obesity rates differ in hot (and cold) spots compared to non-hot (and non-cold) spots. Other hypotheses are examined with interaction variables between the treatments and urban, rural, and household income variables. These hypotheses state that urban or rural aspects of census tracts accentuate the treatment differences; and there exist different chronic disease-income relationships at various income levels and across treatment and non-treatment census tracts.

Our triangulation of methods employs three publicly available datasets: food store locations (Source: WVfoodlink project), socioeconomic variables (Source: American Community Survey (ACS)), and health outcomes (Source: County Health Rankings and Roadmaps). Given difficulties in finding experimental and observational data for diabetes and obesity at the census tract level, a metric was developed through the following steps. The County Health Rankings and Roadmaps provide the data for the percentage of adults aged 20 and above with diagnosed diabetes at the county level. The same source also provides the percentage of adult population (age 20 and older) that reports a body mass index greater than or equal to 30 kg/m 2 which is considered as obese. In this research, the percentages of each county's diabetes and obesity were distributed to each census tracts.

Using previous research on the determinants of neighborhood food environments [14, 15] , the following socioeconomic characteristics were used as covariates to control for the selection into the treatments: number of family households; individuals who were age 25 and above who had a bachelor's degree; estimated vehicles available; households that received food stamp/SNAP benefits in the pasts 12 months; mean household income (adjusted in 2016 dollars); unemployed civilians who are in the labor force; age groups (<19 and >59); race (the proportion of non-white population); and health insurance coverage. Finally, community health centers (CHS) in each census tracts served as a control for health services. These data originated from the West Virginia Primary Care Association. Table A1 shows the sample means of all matching covariates among four groups: (1) hot spots as the treatment group when a census tract is in a hot spot, (2) non-hot spots that are the control group for hot spots, (3) cold spots as the treatment for cold spot analysis, and (4) non-cold spot control census tracts. Note that there are difference columns between the groups along with the corresponding t-test statistics with a null hypothesis of difference equal to zero. Table A2 compares the means of all matching covariates across the treatment groups and the synthetic group (control) obtained via entropy balancing. This table illustrates how entropy balancing improves the quality of matching between hot (cold) and non-hot (non-cold). When we compare the realizations of the pre-treatment characteristics of the hot spots and cold spots to those of the control group, the results reveal the efficacy of entropy balancing. After matching, all covariates are perfectly balanced, and no difference remains. Therefore, the control group in the next regression analysis is comprised of credible counterfactual for the sample of census tract that are hot or cold spots.

The appendix presents baseline model results to assess the impacts on diabetes and obesity rates from solely the treatment variable of populations residing in either a hot or a cold spot (Tables A3 and A4) . From these models, results consistently show that hot spots are healthier environments with decreases in diabetes and obesity rates compared to non-hot spots, while cold spots increase the rates for both chronic diseases compared to non-cold spots. In addition, both unbalanced and balanced models are estimated to compare the unbalanced and the entropy balanced models. As is conventional in the literature [25] , these results illustrate the overestimated bias observed from the unbalanced models compared to the corrected balanced models. Our interpretation is that by not accounting for covariates, such as the number of vehicles in a census tract, the mean household income, education, and age groups, this absence could have inflated the effect of living in non-treatment census tracts on diabetes and obesity rates.

Our main results consist of eight regression models showing four sets of treatment effects for hot and cold spots, respectively (Tables 1 and 2 ). These models examine the impacts of interactions between urban and rural designation with hot spots (columns (1) and (2)) along with the same interaction impacts with cold spots (columns (3) and (4)) on diabetes (Table 1 ) and obesity ( Table 2) rates. For urban hot spots, both diabetes (β= −0.0084, p-value = 0.095) and obesity (β = −0.0278, p-value = 0.004) rates are lower than non-hot, urban, and all rural census tracts. The reduction in obesity rates is particularly notable as it reflects a decrease of seven percent from the statewide average of 37.7 percent in 2016. Urban cold spots had mixed impacts on chronic diseases with a small, statistically significant impact on diabetes (β = −0.0061, p-value = 0.089) and a zero statistical impact on obesity rates. For rural census tracts, the treatment of either a hot or cold spot increased both diabetes and obesity rates by 1-3 percentage points relative to urban census tracts and rural, non-hot, or non-cold spots.

In Tables 1 and 2 , we also sought to determine whether diabetes and obesity rates in hot (and cold) spots are impacted differentially by differing household income levels. Within urban areas, the results point out that as income increases for the top household income quartile (75 percent), both diabetes and obesity rates increase within hot and cold spots. Since, as mentioned earlier in this manuscript, the food landscape in West Virginia is driven by convenience stores and small box retailers, these increases in rates could be evidence of food swamps causing chronic diseases ( [8, 10] ). In urban cold spots, we still found that there is no evidence of a diabetes rate differential among the lower and middle household income quartiles, and their counterparts in urban non-cold spots. However, we found that an increase in earnings of the top household income increases the rate of diabetes in urban cold spots. This result reinforces the view that urban areas are food swamps in West Virginia. In rural census tracts, the treatment-income interaction effect occurs primarily at the 50 percent and lower household income quartiles with statistically significant, negative impacts on chronic disease rates. The largest impacts from the treatment-income effect are on obesity rates. As an example, a one percent change in household income for the population at the 50 percent quartile in rural hot spots results in a 0.0003 percent reduction in obesity rates. When compared to the rural hot spot treatment increase in obesity rates of 0.0185 percentage points, household income at the 50 percent quartile would have to rise over 60 percent to offset the treatment effect of a hot spot in a rural area.

Our sensitivity analysis is decomposed into two parts: first, we relied on the propensity score weighting and the kernel matching method to analyze the sensitivity of our results-using other matching procedures gave us a sense of reassurance in the plausibility of our results. As explained earlier, these two methods are different to the entropy balancing approach because the propensity score weighting assigns to each control a weight that is equal to (1/(1 − P(X)) where P(X) is the propensity score; the kernel matching method uses the observed propensity scores to match treatment and control units while using the closest covariates in terms of propensity scores.

Second, we used the data from year 2017 (t + 1) to verify whether our conclusion holds. Since the locations of all food vendors were verified in 2016, we also ran the entropy balancing model for the year of 2017 because most of the covariates used in the matching process were collected from the American Community Survey (ACS) 2014-2018. Thus, this process helps us understand how our estimates change when we look at another year (t + 1) that is close to the time period for which we have data on the geolocations of food providers in West Virginia.

There are several similarities between our main model, entropy balancing (2016), and the propensity score weighting, the kernel matching method, and the 2017 results. Using the same logic, i.e., starting with regressions that include the interactions between the different treatments (urban and rural hot (and cold) spots) and the different income quartiles, we still find the same results for both techniques. However, the coefficients differ in magnitude with the entropy balancing method. The results are econometrically encouraging per se-the combination of the hot spot analysis and the entropy balancing is a solid way of understanding how the food environment affects chronic diseases in certain geographical entities; i.e., these results are not sensitive to time.

Overall, Tables A5-A10, the results of our sensitivity analyses, inform us that obesity and diabetes models hold up well. When comparing the range of treatment coefficient estimates for entropy balancing (2016) to the propensity score weighting, the kernel matching, and the entropy balancing for the year 2017 for hot and cold spots-the signs of almost all the coefficients are the same as in our entropy balancing model (2016). The only difference resides in the magnitudes of the coefficients. However, the entropy balancing results from the 2017 data are almost the same as those from our original models.

Our findings are of relevance to policymakers in rural states such as West Virginia. Overall, these results highlight the fact that places in West Virginia with access to different food vendors (hot spots) tend to have chronic disease outcomes associated with food swamps. This finding is not new in the literature. However, most researchers tend to separate between healthy and unhealthy food stores and find that individuals who have access to stores that offer a variety of food are healthier than inhabitants of food deserts [32] . In this research, we found that explicitly accounting for spatial forces (hot and cold spots) is better suited for a region where a food oasis might be the only choice. In a rural state such as West Virginia, some areas depend on convenience stores or Mom and Pop shops for all their groceries (https://www.investinginfood.com/commentary-w-v-projects-try-new-ways-to-deliverfresh-food-options/). Figure 2 shows that large numbers of cold spots are in the eastern part of West Virginia where Randolph, Pocahontas, Webster, and Tucker counties are considered mountainous and largely rural.

Therefore, separating between healthy or unhealthy stores might not do justice to residents of cold spots for example. Investigating whether a store is in an urban or rural area seems to tease out better the chronic disease-income relationship of individuals in that area. Moreover, our results also inform us that socioeconomic status (SES) plays a primal role, whether you live in a cold spot or a hot spot. On the one hand, these results showed consistent evidence that income growth in the top income quartile increases both chronic diseases within urban hot and cold spots. On the other hand, household income increases among the 50 percent quartile and lower income groups are linked to lower rates of chronic diseases within rural census tracts.

This research examined the differential impacts of income and access to retail food providers in West Virginia. After controlling for socioeconomic variables in hot and cold spots, we found that urban and rural areas in West Virginia are affected differently by the food landscape. First, we found that the household income impacts on chronic diseases among the top earners in both hot and cold spots located in urban areas are essentially zero. Thus, policies that aim to reduce chronic disease rates by improving access to healthy food alone in urban areas of West Virginia might not work as the problem could be more related to diet preferences than food access. As an explanation, we note that more than 76 percent of the food retail stores in West Virginia are composed of convenience stores and small box retailers so that households are more likely to patronize these types of stores. Phillips et al. highlighted the issue of food swamps and diabetes-related hospitalization and found a positive relationship between diabetes-related hospitalizations and a higher index of food swamp severity [10] .

Second, we consistently found that rural cold spots have higher chronic disease rates, and at the same time, increasing household income for the lower-income groups in these cold spots results in a lowering of chronic disease rates. Thus, improving access to healthy food providers might work in rural cold spots. Therefore, policies such as the healthy food financing that offer food retailers subsidies to locate in underserved areas should be encouraged in rural cold spots. Nonetheless, these policies should be accompanied by some means-tested benefits in order to boost the demand for nutritious food by low-income households. Such a policy recommendation is supported by other researchers who currently lobby for increases in SNAP benefits ( [9, 20] ). Our findings that income increases among low-income households reduce rates of chronic diseases shows that given the opportunity, these households would improve their healthy habits.

Despite significant contributions in methodology and implications, a number of limitations should be acknowledged. First, when examining the food environment in West Virginia, we did not separate between healthy stores and non-healthy stores. Figure 2 in our study demonstrates how stores that sell food in West Virginia are largely dominated by convenience stores. Without a detailed datasets on how people shop at the census tract level, it is almost impossible to capture the effect of a healthy store on the health conditions of different demographics in a rural state such as West Virginia. This issue also did not allow us to study whether the differences in obesity and diabetes rates between hot and cold spots and their comparable non-hot and non-cold spots census tracts are driven by demand tastes or by the supply (as we suggest in this study). Second, future studies should utilize spatially measured data for diabetes and obesity rates collected at the census tract level. Finally, the Center for Disease Control (CDC) includes individuals who have diabetes and severe obesity within the group of those who are at a higher risk of being severely affected by COVID-19 (https://www.cdc.gov/coronavirus/ 2019-ncov/need-extra-precautions/groups-at-higher-risk.html#severe-obesity). Due to the lack of data for the years 2016 and 2017 and given this recent global health issue, this research did not include this variable-a new avenue of research for further studies would be to understand how place effects that impact food intakes are associated with this novel pandemic. Statistical significance: '*'10 percent, '**' 5 percent, "***" 1 percent. p-value in parentheses. Table A4 . The impacts of hot and cold spots on obesity rates.

(2) Balanced Statistical significance: "*" 10 percent, "**" 5 percent, "***" 1 percent. p-value in parentheses. Statistical significance: "*" 10 percent, "**" 5 percent, "***" 1 percent. p-value in parentheses. Table A9 . The impacts of urban and rural hot (and cold) spots on diabetes (2017).

(1) Statistical significance: "*" 10 percent, "**" 5 percent, "***" 1 percent. p-value in parentheses.

Using a social ecological model to explore upstream and downstream solutions to rural food access for the elderly

Juggling the five dimensions of food access: Perceptions of rural low income residents

Co-constructing food access issues: Older adults in a rural food environment in West Virginia develop a photo narrative

Associations between body mass index, shopping behaviors, and characteristics of the neighborhood food environment among female adult Supplemental Nutrition Assistance Program (SNAP) participants in Eastern North Carolina

The local food environment and diet: A systematic review

Availability of healthier options in traditional and non-traditional rural fast-food outlets

New neighborhood grocery store increased awareness of food access but did not alter dietary habits or obesity

Food swamps predict obesity rates better than food deserts in the united states

Food deserts and the causes of nutritional inequality

US county food swamp severity and hospitalization rates among adults with diabetes: A nonlinear relationship

Food deserts"-Evidence and assumption in health policy making

Convenience stores are the key food environment influence on nutrients available from household food supplies in Texas border colonias

Traffic congestion and infant health: Evidence from e-zpass

Food environments and obesity: Household diet expenditure versus food deserts

Food swamps and food deserts in Baltimore city, md, usa: Associations with dietary behaviours among urban adolescent girls

Remedying food policy invisibility with spatial intersectionality: A case study in the Detroit metropolitan area

Longitudinal associations between neighborhood physical and social environments and incident type 2 diabetes mellitus: The multi-ethnic study of atherosclerosis (mesa)

Neighborhood social and physical environments and type 2 diabetes mellitus in African Americans

Is the focus on food deserts fruitless? In Retail Access and Food Purchases across the Socioeconomic Spectrum

Developing an agenda for research about policies to improve access to healthy foods in rural communities: A concept mapping study

Rural Grocery Stores: Importance and Challenges

Economic impact of organic agriculture hot spots in the United States

The effect of unemployment on the mental health of spouses-Evidence from plant closures in Germany

Entropy balancing for causal effects: A multivariate reweighting method to produce balanced samples in observational studies. Political Anal

The impact of us sanctions on poverty

Out of the outhouse: The impact of place-based policies on dwelling characteristics in Appalachia

Does matching overcome Lalonde's critique of non-experimental estimators?

Who benefits from calorie labeling? In An Analysis of Its Effects on Body Mass

The Economic Impact of Coal in West Virginia

Examining health and well-being outcomes associated with mining activity in rural communities of high-income countries: A systematic review

Does healthy food access matter in a french urban setting?

We would like to thank Michael O'Conor who contributed at the early stage of this project.

The authors declare no conflict of interest.