key: cord-0654240-3j1dbtzv
authors: Barua, Limon; Zou, Bo; Zhou, Yan; Liu, Yulin
title: Modeling Household Online Shopping Demand in the U.S.: A Machine Learning Approach and Comparative Investigation between 2009 and 2017
date: 2021-01-11
sha: e70f11bd95f314c293ce1c08e11fc374bf0a6812
doc_id: 654240
cord_uid: 3j1dbtzv

Despite the rapid growth of online shopping and research interest in the relationship between online and in-store shopping, national-level modeling and investigation of the demand for online shopping with a prediction focus remain limited in the literature. This paper differs from prior work and leverages two recent releases of the U.S. National Household Travel Survey (NHTS) data, for 2009 and 2017, to develop machine learning (ML) models, specifically gradient boosting machine (GBM), for predicting household-level online shopping purchases. The NHTS data allow investigation not only nationwide but also at the level of households, which is more appropriate than the individual level given the connected consumption and shopping needs of members in a household. We follow a systematic procedure for model development, including employing the Recursive Feature Elimination algorithm to select input variables (features) in order to reduce the risk of model overfitting and increase model explainability. Extensive post-modeling investigation is conducted in a comparative manner between 2009 and 2017, including quantifying the importance of each input variable in predicting online shopping demand and characterizing value-dependent relationships between demand and the input variables. In doing so, two recent advances in machine learning techniques, namely Shapley value-based feature importance and Accumulated Local Effects plots, are adopted to overcome inherent drawbacks of the popular techniques in current ML modeling.
The modeling and investigation are performed both at the national level and for three of the largest cities (New York, Los Angeles, and Houston). The models developed and insights gained can be used for online shopping-related freight demand generation and may also be considered for evaluating the potential impact of relevant policies on online shopping demand.

Demand for online shopping is rapidly growing. In the U.S., the value of online transactions increased by $76.46 billion between 2018 and 2019, from $523.64 billion to $600.10 billion. In 2020, U.S. consumers were projected to spend $794.5 billion online, with part of the growth due to COVID-19 (Intelligence, 2020). The rapid growth of online shopping has profound impacts on transportation. First, online shopping may substitute, complement, or modify personal travel to stores (Mokhtarian, 2002; Cao, 2009; Shi et al., 2019) and thus have implications for changing personal vehicle miles traveled (VMT). For example, one stream of research argues that the reduction in personal VMT as a result of online shopping can be important in low-density areas where travel for shopping covers long distances (e.g., Farag et al., 2003; Goodchild and Wygonik, 2015). An earlier study in the UK estimates that a direct substitution of car trips by delivery van trips could reduce vehicle-km by 70% or more (Cairns, 2005). Yet another stream of research supports a complementary effect, i.e., people frequently buying or searching online tend to make more shopping trips (e.g., Cao

While a body of research has appeared toward understanding online shopping behavior (see Section 2 for a review of the literature), some important gaps remain. First, most of the existing studies focus on the interactive relationship between online and in-store shopping, with ample but diverse empirical evidence (Shi et al., 2019).
However, the ability to predict the volume of online shopping with reasonable accuracy has not attracted much attention despite its practical importance for transportation planning. Almost all existing research resorts to econometric or statistical models. Based on the reported goodness-of-fit, many of those models would not be adequate for online shopping demand prediction purposes (although prevalent method for characterizing the relationships between response variable and input variables but can be problematic when input variables are correlated as is often the case, we employ a new approach called Accumulated Local Effects plots developed by Apley and Zhu (2020) that explicitly accounts for the presence of correlation of input variables and is also computationally less expensive.

The remainder of the paper proceeds as follows. Section 2 reviews the relevant literature on online shopping. GBM model development is presented in Section 3, followed by a description of the data used in the study in Section 4. Section 5 describes model implementation. Section 6 performs post-modeling analysis, including quantifying the importance of the input variables and the relationships between the input and response variables. Section 7 extends the modeling and analysis to three of the largest cities in the U.S. Finally, Section 8 concludes and suggests directions for future research.

Our review of the literature is organized based on the data used: 1) dedicated survey data for local areas; 2) data as part of a larger travel survey for a metropolitan area; and 3) national-level data.

Most of the studies on online shopping behavior use dedicated surveys administered at specific locations. Farag et al.
(2005) collect a data sample of 826 respondents from four municipalities in the Netherlands to investigate the effects of gender, age, income, land use characteristics, and car ownership on the relationship among frequencies of online searching, online buying, and nondaily shopping trips. Path analysis is conducted. The study is extended by Farag et al. (2007) in which structural equation modeling (SEM) is used. Using data of 392 Internet users from the Columbus metropolitan area in Ohio, Ren and Kwan (2009) estimate a negative binomial and a linear regression model to reexamine the effects of accessibility to local shops and the residential context on the adoption of e-shopping and the frequency of buying online. Age, gender, work hours, income, education, adult percentage in the household, Internet use, race, local population density, and shopping opportunity are included as input variables. Weltevreden and Rietbergen (2007) study the impact of online shopping on in-store shopping based on a dataset of 3,074 Internet users who shop at eight city centers in the Netherlands. The authors use multinomial regression and binomial logistic regression models and find that age, owning a credit card, Internet access and use, and car accessibility value at city centers have significant effects on online shopping. Using data of 539 adult Internet users in the Minneapolis-St. Paul metropolitan area, Cao et al. (2012) investigate the effects of age, the number of vehicles in the household, gender, driving license, income, education, occupation, and employment status on online shopping. It is found that online searching frequency has positive impacts on both online and in-store shopping frequencies and online buying positively affects in-store shopping. For further reviews of the earlier studies, readers may refer to Cao (2009) . Among the more recent research, Lee et al. 
(2017) use survey data from more than 2,000 residents in Davis, California to explore the effect of personal characteristics, attitudes, perceptions, and the built environment on the frequency of shopping online within three distinct shopping settings. Both univariate ordered response models and pairwise copula-based ordered response models are estimated. The authors find a complementary relationship between online and in-store shopping, even after controlling for demographic variables and attitudes. Using 952 Internet users from two cities in northern California, Zhai et al. (2017) examine the interactions between e-shopping and store-shopping for search goods (books) and experience goods (clothing). The authors find that, among other things, clothing is more likely than books to be associated with store visiting for Internet users. Maat and Konings (2018) investigate whether innovation diffusion or accessibility gains drive the replacement of physical shopping by online shopping, by estimating fractional logit models based on a survey of 534 respondents in Leiden, the Netherlands. Focusing on e-shopping behavior in China, Ding and Lu (2017) use a data sample of 791 respondents from a GPS-based activity travel diary in the Shangdi area of Beijing and develop SEM to investigate the relationships between online shopping, in-store shopping, and other dimensions of activity travel behavior. Similarly, SEM is performed to examine the interaction between e-shopping and in-store shopping using a data sample of 1,032 respondents in the city of Nanjing (Xi et al., 2020) . Shi et al. (2019) perform regression analysis using data from interviews with 710 respondents in Chengdu. It is found that e-shopping behavior is significantly affected by sociodemographics, Internet experience, car ownership, and location factors. In addition, the results suggest that e-shopping has a substitution effect on the frequency of shopping trips. 
The association of spatial attributes with e-shopping is studied in Zhen et al. (2018).

As online shopping is gaining increasing popularity, online shopping information has been incorporated into metropolitan area travel surveys. The use of this information for understanding online shopping behavior has been explored by several researchers. Ferrell (2004, 2005) uses the San Francisco Bay Area Travel Survey 2000 data to investigate the relationship between home-based teleshopping and shopping travel. In Ferrell (2004), the relationship between travel behavior (number of trips, travel distance, and trip chaining) and home-based teleshopping is explored using linear regression. In Ferrell (2005), the impacts of age, car availability, household income, Internet, homeownership, driving license, education, and health condition of an individual on home-based teleshopping are explored by using SEM. Dias et al. (2020) use the 2017 Puget Sound Household Travel Survey data to explore the relationship between online and in-person engagement in the shopping domain while distinguishing between shopping for non-grocery goods, grocery goods, and ready-to-eat meals. The effects of the number of adults, employment status, population density, household tenure, household type, vehicle availability, and household income on household-level online shopping are explored.

As mentioned in Section 1, due to the scarcity of data and perhaps also unawareness among researchers of the online shopping-related information that has been added to national data sources, national-level research on online shopping behavior remains more limited than studies using the dedicated local surveys or metropolitan area travel surveys reviewed above. We are aware of four studies in which national-level datasets are used. Three of them relate to the NHTS data.
Zhou and Wang (2014) explore the relationship between online shopping and shopping trips by analyzing the travel pattern-related variables (number of shopping trips, total number of trips, average travel time, gas price) from the 2009 NHTS data. Using the same dataset, Wang and Zhou (2015) develop a binary choice model and a censored negative binomial model to investigate the effects of the Internet, education, age, gender, race, household size, number of household vehicles, home type, population density, rural, and urban size on home delivery frequency. Ramirez (2019) performs negative binomial regression using the 2017 NHTS data to explore the impacts of gender, age, household income, race, education, job category, urban/rural, and the number of drivers in the household on online shopping demand. Besides NHTS data, another national-level data source is the 2016 American Time Use Survey, which is used in Jaller and Pahwa (2020) to investigate the environmental impacts of online shopping. Factors including gender, age, education, employment status, household income, population density, and season are considered to understand their effect on online shopping decisions. Table 1 summarizes the above reviewed studies with a U.S. focus, given that our interest in this paper is also in U.S. online shopping. In the table, we present the data sources, sample types, and modeling techniques. As is clear in the table and from the review above, all these studies resort to econometric or statistical modeling. Many of the existing studies focus on the relationship between online shopping and in-store shopping, whereas the ability to predict online shopping demand with reasonable accuracy has not been paid attention to despite its importance for transportation planning. Also, econometric/statistical modeling techniques often give an estimate of the effect of an input variable as a single number. However, the effect could vary by the value of the input variable. 
The constrained, single number-based effect estimates in turn limit the ability of the models to serve demand prediction purposes. Moreover, as online shopping is continuously developing, there is a need for, but no research on, understanding the evolving influence of different input variables on online shopping over time at the national as well as local levels. By leveraging ML and some of its latest advances, our research tries to fill these gaps.

the selection of input variables is internalized in the decision tree, making the algorithm robust to irrelevant input variables. With these strengths, GBM has been reported to yield better prediction than traditional statistical models (e.g., linear regression and ARIMA) and other ML models (e.g., Random Forest (RF) and SVM) on a number of prediction tasks (Ogutu et al., 2011; Zhang and Haghani, 2015).

This section provides a description of the methodology for GBM model development, consisting of three steps: model training, validation, and testing. In line with the three steps, the data used for model development are split into three portions. Following a common rule of thumb (Bisong, 2019), a 60-20-20 split of the data is adopted.

Step 1: Model training. Use the first portion (60%) of data to train GBM models under different combinations of model hyperparameters.

Step 2: Model validation. Use the second portion (20%) of data for model validation. This step involves selecting a trained model with the best prediction accuracy but not subject to overfitting.

Step 3: Model testing. Use the remaining portion (20%) of data to further test the prediction accuracy of the selected GBM model.

where $L(y, F(\mathbf{x}))$ is the loss function associated with $y$ and $F(\mathbf{x})$ (e.g., squared error $(y - F(\mathbf{x}))^2$). Thus, the goal of model training can be approximately viewed as minimizing the model prediction error. The response variable may come from different distributions.
In ML theory, the different distributions naturally lead to different specifications for the loss function $L(y, F)$. Given that online shopping demand is a continuous response variable, the $L_2$ squared loss function $L(y, F)_{L_2} = \frac{1}{2}(y - F)^2$ and the robust regression Huber loss function $L(y, F)_{\text{Huber},\delta}$ are often used (Natekin and Knoll, 2013). We choose the Huber loss function, which captures not only the $L_2$ squared loss but also the mean absolute error $L_1$. As shown in Eq. (2), $L(y, F)_{\text{Huber},\delta}$ is $L(y, F)_{L_2}$ when the absolute error of prediction $|y - F|$ is smaller than or equal to $\delta$, but becomes $L(y, F)_{L_1} = |y - F|$ with a multiplier $\delta$ minus a constant term $\delta^2/2$ when the absolute error of prediction is greater:

$$L(y, F)_{\text{Huber},\delta} = \begin{cases} \frac{1}{2}(y - F)^2, & |y - F| \le \delta \\ \delta\,|y - F| - \frac{\delta^2}{2}, & |y - F| > \delta \end{cases} \qquad (2)$$

Following the common procedure in GBM, we parameterize $F(\mathbf{x})$ as $F(\mathbf{x}; P)$, where $P = \{P_1, P_2, \ldots\}$ is a finite set of parameters. Choosing a parameterized function $F(\mathbf{x}; P)$ then changes to the following problem of parameter optimization: $P^* = \arg\min_P E_{y,\mathbf{x}} L(y, F(\mathbf{x}; P))$. Consequently, $F^*(\mathbf{x}) = F(\mathbf{x}; P^*)$. To determine $P^*$, we employ steepest descent as the numerical minimization method, which iteratively updates $P^*$ as in Eq. (4): where $P_j$ is the $j$th element in $P$. The step size is obtained from line search as follows: Note that the minimization problem of (5) only involves one decision variable, the step size.

GBM views each point in $\mathbf{x}$ as a "parameter" (so there are $N$ "parameters"). Then, the iterative relationship in steepest descent that corresponds to (4) becomes: However, there is a key difference here that prevents direct application of the above steepest descent. That is, the gradient is defined only at the data points $\{\mathbf{x}_i\}_1^N$ but cannot be generalized to other $\mathbf{x}$-values. One way of generalization, according to Friedman (2001), is to parameterize $F(\mathbf{x})$ as:

$$F(\mathbf{x}; \{\beta_m, a_m\}_1^M) = \sum_{m=1}^{M} \beta_m h(\mathbf{x}; a_m)$$

where $\{\beta_m, a_m\}_1^M$ are parameters. $M$ is the maximum number of iterations in performing the GBM-equivalent steepest descent. The generic functions $h(\mathbf{x}; a_m)$, $m = 1, 2, \ldots, M$, are usually simple parameterized functions of the input variables $\mathbf{x}$, characterized by parameters $a_m = \{a_1, a_2, \ldots\}$.
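To make the piecewise definition of the Huber loss concrete, the following is a minimal sketch in plain Python. The function names `huber_loss` and `huber_negative_gradient` and the argument name `delta` are illustrative choices, not taken from the paper. The second function returns the negative derivative of the loss with respect to the prediction, which is the quantity a gradient-based fitting procedure would work with at each data point; treating it this way is a standard reading of gradient boosting and is an assumption about how the authors' implementation proceeds.

```python
def huber_loss(y, f, delta):
    """Huber loss for one observation y and one prediction f.

    Quadratic (half squared error) when |y - f| <= delta,
    linear (delta * |y - f| - delta^2 / 2) otherwise.
    """
    abs_err = abs(y - f)
    if abs_err <= delta:
        return 0.5 * abs_err ** 2
    return delta * abs_err - 0.5 * delta ** 2


def huber_negative_gradient(y, f, delta):
    """Negative derivative of the Huber loss with respect to f.

    Equals the residual y - f in the quadratic zone and is clipped
    to +/- delta in the linear zone, which limits the pull of outliers.
    """
    residual = y - f
    if abs(residual) <= delta:
        return residual
    return delta if residual > 0 else -delta


if __name__ == "__main__":
    # Small error: behaves like half squared error.
    print(huber_loss(3.0, 2.5, delta=1.0))               # 0.125
    # Large error: grows linearly instead of quadratically.
    print(huber_loss(5.0, 2.0, delta=1.0))               # 2.5
    # The gradient signal for the large error is capped at delta.
    print(huber_negative_gradient(5.0, 2.0, delta=1.0))  # 1.0
```

The two branches agree at $|y - F| = \delta$ (both give $\delta^2/2$), so the loss is continuous, and the capped gradient is what makes the loss less sensitive to households with unusually large purchase counts than the squared loss would be.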
In GBM, $h(\mathbf{x}; a)$ is called a "base learner" and is often a classification tree. In this paper, we consider the following regression tree specification for $h(\mathbf{x}; a)$: This can be obtained from the following least-square minimization problem, the reason being that solutions to least-square minimization problems have been well studied and thus can follow standard procedures. which is used to update $F(\mathbf{x})$: where the learning rate, which takes a value in $(0, 1]$, is a hyperparameter in the GBM model. Considering a learning rate less than one attempts to prevent overfitting by "shrinking" the update of $F(\mathbf{x})$. Previous numerical experiments revealed that a small learning rate can result in better prediction performance of GBM models. In ML, the process represented by (9)-(11) is called "boosting". The overall procedure thus gets the name of "gradient boosting". Overall, the GBM algorithm can be summarized as follows:

Model validation consists of identifying the combination of hyperparameter values that yields the best model fit without overfitting. To select the GBM model with the highest prediction accuracy, $R^2$ is used:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \qquad (12)$$

where $y_i$ denotes the observed value of the $i$th observation, $\hat{y}_i$ is the corresponding predicted value, and $\bar{y}$ is the mean of the observed values: $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$. We calculate $R^2$ for each trained model and sort the models in descending order based on $R^2$. These models are then evaluated one by one, starting from the one with the highest $R^2$, as follows. We apply a trained model to the validation dataset to generate predicted values and calculate $R^2$. If the difference between this $R^2$ and the $R^2$ associated with the training dataset is less than a threshold (0.1 in this study), then the model is selected as the best model. Otherwise, the difference in $R^2$ suggests the presence of overfitting. The model is then discarded and the next model is evaluated. In the end, the best combination of hyperparameter values, which corresponds to the first encountered model without overfitting, is identified.
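The selection rule described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' code: the names `r_squared` and `select_model` and the tuple layout of `candidates` are hypothetical, and the "difference" between the two $R^2$ values is taken here as training minus validation, which is one reasonable reading of the text.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def select_model(candidates, threshold=0.1):
    """Pick the first model, by descending training R^2, that does not overfit.

    candidates: list of (name, train_r2, valid_r2) tuples, one per
    hyperparameter combination (hypothetical layout).
    Returns the name of the selected model, or None if every candidate
    shows a train/validation gap of at least `threshold`.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    for name, train_r2, valid_r2 in ranked:
        if train_r2 - valid_r2 < threshold:  # assumed sign convention
            return name
    return None


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    candidates = [
        ("deep_trees", 0.95, 0.60),     # best fit on training data, large gap
        ("medium_trees", 0.80, 0.75),   # slightly lower fit, small gap
        ("shallow_trees", 0.70, 0.69),
    ]
    print(select_model(candidates))  # medium_trees
```

In this toy example the model with the highest training $R^2$ is rejected because its validation $R^2$ is far lower, and the next model in the ranking is selected, which mirrors the "first encountered model without overfitting" rule.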
To further ensure that the selected hyperparameter values lead to a good GBM model, $k$-fold cross validation is also performed. Specifically, the training and validation datasets are merged and randomly divided into $k$ subsets. Then, $k - 1$ subsets are selected for training a GBM model using the selected hyperparameter values. The trained model is then used for prediction using the remaining subset. The $R^2$ values of the training subsets and the testing subset are calculated. This process is repeated $k$ times. If the average $R^2$ associated with model validation is much lower than that associated with model training, then the hyperparameter values are discarded and the next best combination of hyperparameter values (as described in the previous paragraph) is evaluated. Otherwise, the selected hyperparameters and associated GBM model are kept.

Given the selected GBM model, the model testing step is to provide an understanding of how accurate the model prediction could be on new data. Specifically, after model training and validation, the remaining 20% of the data not used in the previous two steps are used to check if the model can still yield good accuracy in prediction. In measuring the prediction accuracy, root-mean-square error (RMSE) is used in addition to $R^2$. As shown in Eq. (13), RMSE is defined as the square root of the average of squared differences between predicted and observed values over all observations. A lower RMSE value means a smaller average difference between $y_i$ and $\hat{y}_i$, thus a better fit of the model.

$$\text{RMSE} = \sqrt{\frac{1}{N'} \sum_{i=1}^{N'} (\hat{y}_i - y_i)^2} \qquad (13)$$

where $\hat{y}_i$ and $y_i$ are the predicted and observed values of the $i$th observation. $N'$ is the testing data size.

Given the large number (48) of candidate input variables, it is desirable to build less complex models with fewer features by deciding which input variables are essential for prediction and which are not. This can be useful when one wants to reduce the risk of overfitting and increase model explainability (Guyon et al., 2002; Burkov, 2019).
To this end, some feature selection procedure needs to be performed. The idea is to discard input variables that make limited contributions to model predictability. In this paper, we consider the Recursive Feature Elimination (RFE) algorithm, which requires moderate computation efforts (Guyon et al., 2002) and is shown to perform better than other feature selection techniques such as least

3. Train and identify the best trained GBM model using $(X, y)$
4. Compute 10-fold cross validation score for the model
5. Determine feature importance
6. Identify and remove the input variable $x'$ with the least importance
7. Update input variables: $X \leftarrow X - x'$
8. Until stopping criteria is met

After implementing the RFE algorithm, 16 input variables are retained for both 2009 and 2017. Interestingly, the 16 input variables are the same for both years, as listed in Table 2 below. Table 3 provides summary statistics of these variables.

With the 16 input variables, two GBM models are developed, one for 2009 and the other for 2017. The RMSE values presented in Table 5

To further examine the prediction performance of the GBM models, we compare the models with several alternative models including linear regression, quadratic regression, SVM, and RF, using the same

In Eq. (14)

By applying the Shapley value-based method, the importance of all input variables in the GBM models for 2009 and 2017 is computed, with results displayed in Fig. 2 (ranked based on importance in 2017). We also present the change in the ranking of importance of the input variables between 2009 and 2017 in Fig. 3, where blue arrows indicate no change in ranking, red arrows denote ranking drops, and green arrows represent ranking rises. For the discussions below, we focus on the ranking and ranking changes of the input variables.

In what follows, we present the ALE plots in four subsections (6.2.1-6.2.4), each corresponding to one category of input variables shown in Table 2.
In each category, the input variables are arranged in the order of their feature importance in 2017 (shown in the right column of Fig. 3).

The ALE plots for input variables in the socioeconomic characteristics category are presented in Fig. 4. For household income, we observe that in 2009 the online shopping purchases of a household slightly decrease when household income increases from $2,500 to around $15,000 and then increase more monotonically. Turning to the two education-related variables, the percentage of household members with a bachelor's degree does not give a clear-cut message. In both years, the highest online purchases occur when a household has part of its members with a bachelor's degree. While some prior investigations support that higher education increases one's Internet use capability, which enables and encourages online shopping (Farag et al., 2007; Cao et al., 2012), the non-monotonic relationship found here is more in line with the arguments in other existing research that education background has no, negative, or mixed effects on online shopping and that online shopping is actually a relatively easy task that does not require higher education (Mahmood et al., 2004; Zhou et al., 2007). Nonetheless, we speculate that some basic Internet literacy is still needed. Too limited an education background may still affect a household's propensity for online shopping. This is supported by the overall negative relationship between online purchases and the percentage of household members without a high school degree. It is also interesting to observe that the lowest propensity is reached when members without a high school degree dominate a household (50%), and remains at the same low level as the percentage increases. Finally, the ALE plot shows that owning the home property tends to encourage online purchases. The difference is even amplified in 2017 compared to 2009.
A possible reason for the renting-owning difference is that owning a home property (e.g., owning a single-family house as opposed to renting an apartment unit) gives a household a sense of permanency and possibly more space (a single-family house is likely to be larger than an apartment unit), and consequently makes the household purchase more to improve the living place (buying appliances, decorations, etc.), whereas such motivation would be less if just temporarily renting a place. The ALE plots for input variables in the trip characteristics category are presented in Fig. 5 . First, the ALE of gas price shows some interesting results. In 2009, online purchases decrease when gas price increases from $1.5/gallon to around $2.25/gallon, and then stay roughly constant when the gas price is between $2.25/gallon and about $4.0/gallon. But online purchases start to increase as gas price goes beyond $4.0/gallon. The initial decline seems counterintuitive at first sight. A possible explanation, following Ma et al. (2011) , is that as the initial gas price increases from a low base price, the dominant factor affecting online purchases may be the reduction in the budget allocatable for shopping, which leads to a decline in online purchases. On the other hand, the increasing trend when gas price is over $4.0/gallon is understandable: as gas price increases, driving becomes more expensive, adding to the generalized travel cost to go to stores. Consequently, online shopping becomes more attractive. In contrast to 2009, the overall trend of online shopping varies less in 2017 over a narrower range of gas price, though with some fluctuations. The difference in the range coverage of gas price in the two years is due to less variation of gas price in 2017 than in 2009. In general, households seem to be less sensitive to gas price when purchasing online in 2017. 
Turning to the ALE plot for travel time of household members per day, the two curves for 2009 and 2017 both follow an overall increasing trend. As a household spends more time traveling, it is likely to have less time available for shopping. As online shopping requires less time and fewer activities than in-store shopping, household members with less shopping time are naturally more inclined to purchase over the Internet. We also note that when household travel time is near zero, the ALE values are actually not the lowest, or even close to the lowest. Our speculation is that people with almost no travel at all will spend most of their time at home, thus likely taking care of things, including shopping, through the Internet as much as possible. This effect seems more evident for 2017.

The ALE plots for the input variables in the category of land use characteristics are presented in Fig. 6. For population density, we observe a "V" shape, or a first-decreasing-then-increasing trend, which can be explained as follows. When population density is very low, it probably would require a long trip to get to a nearby store for shopping. In this case, shopping over the Internet would be more convenient, saving households a substantial amount of shopping-related travel time. As population density increases, the time spent going to stores decreases. As a result, households will be more willing to shop in stores. As population density continues to increase, households again become more inclined to shop online, which may be attributed to two factors. First, greater population density means greater human interactions in working, social, and other contexts, reducing the time available for in-store shopping. Second, previous research has argued that people living in dense areas tend to have greater access to the Internet (Loomis and Taylor, 2012), which is essential to online shopping. Related to this, households in an urban location tend to shop more than those in non-urban areas.
Between the two years, the effects of population density and urban location are stronger in 2017 than in 2009.

The ALE plot for the binary input variable indicating whether all members in a household use the Internet daily is presented in Fig. 7. Since the variable is binary, ALE is presented in two bars for each year, one with daily Internet use and the other without. The plot clearly shows that daily Internet usage has a significant impact on online purchases for both 2009 and 2017, which supports the argument that more frequent use of the Internet enables more online shopping. This may also be attributed to additional online shopping demand that is "induced" by more frequent Internet use, a phenomenon that has been seen in other transportation contexts (e.g., Cervero and Hansen, 2002; Zou and Hansen, 2012). With daily Internet use, the average number of online purchases in a household in a 30-day period will be about 1.1 higher than otherwise. In 2017, the difference is slightly smaller (about 0.9).

follow the same procedure of model training, validation, and testing described in Section 3, now using city-specific data. Thus, six city-specific GBM models are trained. The data size (number of household observations), best hyperparameter values, and $R^2$ of the GBM models based on the testing data are reported in Table 7. We observe that the $R^2$ values of the city-specific models are lower than those of the national-level models (in Table 4). This is not surprising and is attributable to the much smaller number of data points for training each city-specific model than for training the national models.

Using the developed city-specific models, ALEs are plotted with respect to each of the input variables, for each city and for 2009 and 2017. We find that overall, the ALE plots follow the trends of the national models in subsection 6.2, with some interesting new findings for some or all of the three big cities. These new findings are presented below (Fig. 8-15).
Each figure corresponds to one input variable, with the left graph plotting ALE for 2009 and the right graph plotting ALE for 2017.

For input variables in the socioeconomic characteristics category, the first interesting finding concerns household income (Fig. 8) (Fig. 9). In 2017, the insensitivity remains for Los Angeles and Houston for household size between two and five (six). The relationship for New York follows more the national trend. It is interesting to see that for all three cities, a two-member household has significantly more online purchases than a single-member household in 2017, which is different from 2009. Fig. 10 shows that the number of vehicles of a household seems to affect online purchases in these big cities mostly with a limit, which is unlike the national level (Fig. 4) when the number of vehicles in a household becomes large. In 2009, the effect plateaus in Los Angeles and Houston after the number of vehicles reaches six. In 2017, that number is four for New York (roughly plateaued), seven for Los Angeles, and six for Houston.

is that these cities constantly welcome newcomers (college graduates and people moving in for jobs) who start their lives in these cities by first renting a place and need to purchase many items. Also, in these big cities rented and owned property types may be more similar (e.g., mostly apartment units/condos) than at the national level (renting an apartment unit vs. owning a single-family house). Thus, the potential property size effect associated with home ownership is diminished. The absolute value of the difference between renting and owning is also much larger for the big cities than for the national average, with the largest difference being 0.12 in Los Angeles (vs. about 0.007 nationally) in 2009, and nearly 0.14 in New York (vs. about 0.04 nationally) in 2017, which also corroborates the more active online shopping of households in these big cities.

For land use related variables, Fig.
14 shows that unlike the national-level trend, online purchases are not very sensitive to population density when the density of the area is high (above 17,000 people/sq. mile in both years). Fig. 15 further illustrates that the difference between living in urban and non-urban areas becomes much smaller in 2017 than in 2009, which again differs from the national trend and is a sign that online shopping becomes more prevalent across different areas in each of the cities. A possible explanation is that the proximity and relatively small socioeconomic and geographical differences of urban and "non-urban" (i.e., suburban) areas in each city (at the national level, "non-urban" can include really remote rural areas that have very different characteristics from urban areas) facilitate the spread of online shopping within each city. It is interesting to see that for Los Angeles, households in non-urban areas shop even more online than those in urban areas.
The modeling results show that GBM yields much higher prediction accuracy than several other ML (including regression) models. We find that household income contributes the most to predicting online shopping demand. Over time, the importance of Internet use and gender diminishes, while household member age and household size become more important. By employing the ALE technique, value-dependent effects of the input variables on predicted online shopping demand are estimated, which provide richer insights than the single-number estimates in prior research. The estimates show that the effect of the percentage of household members receiving higher education is not monotonic. The generation that grew up with online shopping significantly influences the effect of adult percentage in a household. Households that own their home tend to buy more online than those that rent. Total travel time of a household has an overall positive relationship with online purchases.
However, the number of trips has a non-monotonic effect, with an explanation that more trips not only reduce the available time for shopping but also increase the chance of buying things on the way. The ALE plot for shopping trip percentage shows a mixed effect, suggesting that complementary and substitution relationships may both exist between online and in-store shopping. The relationship between population density of the living neighborhood and online purchases follows a "V" shape, with plausible influencing factors being in-store shopping distance, social interactions, and Internet access. Living in an urban area and having daily Internet use encourage online shopping. As online shopping becomes more prevalent over time, the ALE plots further reveal the differences between 2009 and 2017. We also look into online shopping demand in three of the largest cities in the U.S., and discover commonalities with the national-level results as well as some unique characteristics for these cities. This paper is an initial step in taking a machine learning approach to predicting household-level online shopping demand, and to revealing the importance of influencing factors and their relationships with the demand. The models developed and insights gained can be used for online shopping-related freight demand generation and may also be considered for evaluating the potential impact on online shopping demand of relevant policies, e.g., land use planning, gasoline pricing, and transportation demand management to reduce trip-making. The proposed modeling approach could be further used as future releases of NHTS or similar data become available, which will help gain more in-depth understanding of the evolution of input variable importance and their relationships with household online shopping demand.
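As an illustration of how an ALE curve can expose a value-dependent, "V"-shaped relationship that a single-number estimate would hide, the following is a minimal sketch of first-order ALE for a continuous input. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical predictor with a built-in V-shaped density effect (not the paper's GBM), the rows are synthetic, the density units are arbitrary, and `ale_curve` is a simplified helper that uses quantile-based bin edges and centers the curve with a count-weighted average of bin-edge values.

```python
from statistics import mean

def toy_model(row):
    # Hypothetical predictor (NOT the paper's GBM): V-shaped in density.
    # row = (density, income), both in arbitrary made-up units.
    density, income = row
    return 2.0 + 0.2 * abs(density - 5.0) + 0.1 * income

def ale_curve(predict, rows, j, n_bins=4):
    """First-order ALE of feature j; returns (bin edges, centered ALE at edges)."""
    values = sorted(r[j] for r in rows)
    last = len(values) - 1
    # Bin edges taken at (approximate) empirical quantiles of feature j.
    edges = sorted({values[k * last // n_bins] for k in range(n_bins + 1)})
    ale, counts = [0.0], []
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        # Rows falling in (lo, hi]; the first bin also includes its lower edge.
        in_bin = [r for r in rows
                  if lo < r[j] <= hi or (i == 0 and r[j] == lo)]
        # Local effect: prediction change across the bin, other inputs fixed.
        diffs = [predict(r[:j] + (hi,) + r[j + 1:])
                 - predict(r[:j] + (lo,) + r[j + 1:]) for r in in_bin]
        ale.append(ale[-1] + (mean(diffs) if diffs else 0.0))
        counts.append(len(in_bin))
    # Center: subtract the count-weighted mean of per-bin midpoint values.
    center = sum(c * (ale[i] + ale[i + 1]) / 2
                 for i, c in enumerate(counts)) / sum(counts)
    return edges, [a - center for a in ale]

# Synthetic rows: density 0..10, income cycling through 0, 1, 2.
rows = [(float(d), float(d % 3)) for d in range(11)]
edges, ale = ale_curve(toy_model, rows, j=0)
for e, a in zip(edges, ale):
    print(f"density={e:4.1f}  ALE={a:+.3f}")
```

In this toy setting the curve is positive at both ends and lowest in the middle, whereas a single linear coefficient fitted to the same data would be close to zero and suggest no effect.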
The modeling and analysis could be extended with more advanced approaches, e.g., by combining GBM and a support vector classifier which first classifies household locations so that even higher prediction accuracy could be achieved.

Process Variable Importance Analysis by Use of Random Forests in a
Visualizing the effects of predictor variables in black box supervised learning models
A gradient boosting approach to understanding airport runway and taxiway pavement deterioration
Transforming last-mile logistics: Opportunities for more sustainable deliveries
Principles of Learning
The hundred-page machine learning book
Delivering supermarket shopping: more or less traffic
E-shopping, spatial attributes, and personal travel: a review of empirical studies
The interactions between e-shopping and traditional in-store shopping: an application of structural equations model
Induced travel demand and induced road investment: A simultaneous equation analysis
Impact of drone delivery on sustainability and cost: Realizing the UAV potential through vehicle routing optimization
Advanced freight transportation systems for congested urban areas
Age related differences in learning to use a text-editing system
A comparison of online and in-person activity engagement: The case of shopping and eating meals
Applying gradient boosting decision trees to examine nonlinear effects of the built environment on driving distance in Oslo
The interactions between online shopping and personal activity travel behavior: an analysis with a GPS-based activity travel diary
A working guide to boosted regression trees
Exploring the use of e-shopping and its impact on personal travel behavior in the Netherlands
Empirical investigation of online searching and buying and their relationship to shopping trips
Shopping online and/or in-store? A structural equation model of the relationships between e-shopping and in-store shopping
2009 National Household Travel Survey
2017 National Household Travel Survey
Home-based teleshoppers and shopping travel: Do teleshoppers travel less
Home-based teleshopping and shopping travel: Where do people find the time?
Greedy function approximation: a gradient boosting machine
Selecting the most important self-assessed features for predicting conversion to Mild Cognitive Impairment with Random Forest and Permutation-based methods
Changing retail business models and the impact on CO2 emissions from transport: e-commerce deliveries in urban and rural areas
Gene selection for cancer classification using support vector machines
The elements of statistical learning: data mining, inference, and prediction
Age, gender and income: do they really moderate online shopping behaviour
Crowdsourcing incentives for multi-hop urban parcel delivery network
A variable impacts measurement in random forest for mobile cloud computing
Economies of density in e-commerce: A study of Amazon's fulfillment center network (No. w23361)
US retail ecommerce sells
Evaluating the environmental impacts of online shopping: A behavioral and transportation approach
Design and modeling of a crowdsource-enabled system for urban parcel relay and delivery
Drone based parcel delivery using the rooftops of city buildings: Model and solution
Picture of online shoppers: Specific focus on Davis
Relationships between the online and in-store shopping frequency of Davis, California residents
Forecasting the Internet: understanding the explosive growth of data communications
A unified approach to interpreting model predictions
An empirical investigation of the impact of gasoline prices on grocery shopping behavior
Accessibility or innovation? Store shopping trips versus online shopping
On-line shopping behavior: Cross-country empirical research
Telecommunications and travel: The case for complementarity
Interpretable machine learning
The gender gap in Internet use: Why men use the Internet more than women-a literature review
Age differences in technology adoption decisions: Implications for a changing work force
Gradient boosting machines, a tutorial
A comparison of random forests, boosting and support vector machines for genomic selection
Internet/broadband fact sheet
Study of the relationship between online shopping and home-based shopping trips (Doctoral dissertation
Proactive vehicle routing with inferred demand to solve the bikesharing rebalancing problem
The impact of geographic context on e-shopping behavior. Environment and Planning B: Planning and Design
The distribution network of Amazon and the footprint of freight digitalization
A value for n-person games
Does e-shopping replace shopping trips? Empirical evidence from Chengdu, China
Unbiased split selection for classification trees based on the Gini index
A phenomenological investigation of Internet usage among older individuals
Why don't men ever stop to ask for directions? Gender, social influence, and their role in technology acceptance and usage behavior
Deliveries to residential units: A rising form of freight transportation in the US
E-Shopping versus City Centre Shopping: The Role of Perceived City Centre Attractiveness
The interaction between e-shopping and store shopping: Empirical evidence from Nanjing
Nonlinear feature selection using Gaussian kernel SVM-RFE for fault diagnosis
The interactions between e-shopping and store shopping in the shopping process for search goods and experience goods
A gradient boosting method to improve travel time prediction
Causal interpretations of black-box models
The association between spatial attributes and e-shopping in the shopping process for search goods and experience goods: Evidence from Nanjing
Explore the relationship between online shopping and shopping trips: an analysis with the 2009 NHTS data
Unbiased measurement of feature importance in tree-based methods
Flight delays, capacity investment and social welfare under air transport supply-demand equilibrium

This research is supported in part by the U.S. National Science Foundation and the U.S. Department of Energy through the Argonne National Laboratory. Opinions expressed herein do not necessarily reflect those of the two agencies.