key: cord-0058977-60pyjk6b authors: Politis, Ioannis; Fyrogenis, Ioannis; Papadopoulos, Efthymis; Nikolaidou, Anastasia; Verani, Eleni title: Understanding Willingness to Use Dockless Bike Sharing Systems Through Tree and Forest Analytics date: 2020-08-19 journal: Computational Science and Its Applications - ICCSA 2020 DOI: 10.1007/978-3-030-58802-1_56 sha: e118d7f02707c50076e6323e153390b15fee3618 doc_id: 58977 cord_uid: 60pyjk6b

In this paper we explore factors that affect Bike Sharing System (BSS) usage and how they differ between discrete groups of potential users. BSS have grown rapidly in recent years through technological advances, re-evaluated business models and a reinvention of the mode's utility. Yet, for dockless BSS to be used to their potential and successfully integrated into the urban mobility ecosystem, the factors that promote willingness to use them need to be explored. Using a sample of 500 stated preference responses, classification tree and random forest models are built for three groups of potential BSS users: car users, bus users and pedestrians. Among the considered factors are BSS cost gains, BSS In Vehicle Time (IVT) and Out of Vehicle Time (OVT) gains, and trip frequency, purpose and duration. More specifically, it was found that BSS potential increases for short-duration trips of up to 21 min for car users. Bus users and pedestrians were found to be more likely to choose a BSS option for a higher cost of up to 0.60 and 0.75 euros respectively. On the other hand, sociodemographic characteristics such as household income, gender, education level and occupation were not found to be dominant factors in the mode choice decision. OVT is found to be relatively important only for bus users, while cost gains are comparatively more significant for bus users and pedestrians.

Urbanization has been a steadily rising trend in recent years. Cities are expected to hold 68% of the world's population by 2050 [1].
This, combined with the car-dominant paradigm of recent years, makes adopting socially, environmentally and economically sustainable practices an urgent necessity. The emergence of sharing economies, and shared mobility in particular, is widely considered a promising solution to this predicament. More specifically, increased usage of Bike Sharing Systems (BSS) and cycling in general has been associated with an abundance of positive externalities, such as health benefits [2], reduced traffic congestion [3] and environmental benefits [4].

The evolutionary history of BSSs can be summed up in four generations of systems. The first generation appeared in 1965 in Amsterdam and became known as "White Bikes" [5]. The bikes were placed at random in the city center and were free for public use, which made them susceptible to wear, vandalism and theft and led the system to cease operations. The second generation, known as "Coin Deposit Systems", began operations in Denmark in 1991 and consisted of bikes fitted with a lock that users could open with a refundable deposit [4]. While this attempted to improve on the first generation's failures, it only partially succeeded, as there was no mechanism in place to limit usage times and users often kept the bicycles for extended periods. The third generation of BSSs, known as "IT-based systems", started operating in France in 1998 and was better equipped to handle those problems [5]. It included docking stations with smart technology that made it possible to know when a bike was taken from or returned to a station [6]. It also incorporated electronic means of payment and high deposits, which made it possible to identify users and deter theft.
While the fourth generation of BSS had already been defined by many field experts, before the rise of dockless BSS, as a demand-responsive system that is fully integrated into the city's transportation system, in recent years it has come to be identified with dockless BSSs [6, 7]. The explosive growth of dockless BSS started in China in 2015 and expanded worldwide. The systems' flexibility allows users to choose the start and end points of their trips without depending on fixed docking stations. This is made possible by Global Positioning System (GPS) sensors installed on each bike and by the widespread use of smart devices, which allow users to easily locate, unlock and ride the bicycles in the urban environment [8]. Currently, the operational BSSs around the world are a mix of third and fourth generation systems that often operate as complements to one another. Taken aback by this surge in usage and the sudden emergence of bicycles, and later scooters, in city streets, municipalities and governments lagged behind with regulations, which made it difficult to formulate frameworks that would enable BSSs to become an integrated part of urban transportation [9]. For BSS and micromobility to reach a fully realized and optimized usage that will make them part of a robust and resilient urban mobility system, the factors and mechanisms that make them an appealing mode of transport and increase their modal share need to be thoroughly understood.

A lot of work has already been published that explores both the factors and the user characteristics that enhance BSS usage. A large part of the literature focuses on identifying the characteristics of BSS users and comparing them to the non-user population, mainly using revealed preference data in the form of both system-use data and user surveys.
Regarding the sociodemographic characteristics of BSS users, many studies found significant differences between bike-sharing users and the general population. A large number of studies found current BSS users to be younger on average than the general population [5, 10, 11]. While users' income was found to differ significantly from that of the rest of the population in a number of studies, it was not consistently found to be higher or lower [5, 11-13]. A higher education level was also identified as a contributing factor in some studies [10, 14]. While most studies agree that male users are more likely to prefer BSS, some conclude that females are [10-12, 15]. Other parts of the pertinent literature have examined how environmental factors contribute to increased BSS usage. Improved infrastructure, such as an extended network of bicycle lanes, mixed land use, the placement of docked BSS stations and weather conditions have all been found to significantly affect BSS usage [16-19].

In contrast, little work has been done to examine users' willingness to switch to BSS usage. Campbell et al. [20] determined that trip characteristics such as trip distance, temperature, precipitation and air quality were the factors that mainly affected whether users considered using BSS instead of their currently preferred mode of transport, while sociodemographic characteristics were not found to be as important. Li & Kamargianni used short-distance revealed and stated preference data to examine mode choice, with emphasis on bike-sharing. The results showed that choosing BSS depended heavily on air pollution, weather conditions, cost and travel time, and less so on socioeconomic characteristics [21].
Based on the existing literature, a question that arises is which factors most strongly incentivize potential BSS users who currently prefer other modes of transport to use bike sharing across all urban trip durations:
• Does the proposed population clustering (e.g. car-bus-foot) effectively support different micro-mobility strategies?
• If yes, how and in what manner do the various mode and personal characteristics affect the mode choice process? What are the threshold values for changing mode choice behavior towards dockless BSS?
• Are the tools used in the study (classification trees and random forest analytics) appropriate for interpreting dockless BSS potential?

To tackle those questions, a stated preference survey was designed on the LimeSurvey platform [22] and administered in the field by trained interviewers equipped with tablets. The data collection took place from April 2019 to May 2019 in Thessaloniki, Greece, and a sample of 500 answers was collected. The survey was answered by users of the three currently dominant modes of transport in Thessaloniki: car users, bus users and pedestrians, depending on which of those modes they used for their most frequent trip. The interviewees were asked about the characteristics of their last typical trip, including duration, IVT (In Vehicle Time), OVT (Out of Vehicle Time) and cost. Using those answers, competing values of cost, IVT and OVT for the BSS (or total trip time in the case of pedestrians) were calculated by the survey algorithm and presented to the interviewees in the last section of the survey as a set of 9 mode choice scenarios. In those scenarios the interviewees had to choose between their current mode, with the cost, IVT and OVT they stated, and the dockless BSS with varying competing combinations of cost, OVT and IVT. The variables collected via the survey, along with their types, levels where applicable and descriptions, can be seen in Table 1.
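To make the scenario-building step more concrete, the following minimal Python sketch shows one way a survey algorithm could derive 9 competing BSS alternatives from a respondent's stated trip. It is an illustration only: the function name `build_scenarios`, the multiplier levels, the 3 x 3 layout with a fixed BSS OVT, and the sign convention for the gains (current value minus BSS value) are all assumptions, since the actual levels are given in Table 1 and the survey logic is not described in detail.

```python
from itertools import product

# Hypothetical adjustment levels: the real levels are listed in the paper's
# Table 1, which is not reproduced here, so these numbers are placeholders.
COST_FACTORS = (0.5, 1.0, 1.5)  # BSS cost as a multiple of the stated cost
IVT_FACTORS = (0.6, 0.8, 1.0)   # BSS in-vehicle time as a multiple of stated IVT


def build_scenarios(cost, ivt, ovt, bss_ovt):
    """Return 9 BSS alternatives (3 cost levels x 3 IVT levels) for one respondent.

    Gains are defined as current value minus BSS value, so a positive gain
    means the BSS is cheaper or faster. This sign convention is an assumption.
    """
    scenarios = []
    for cost_f, ivt_f in product(COST_FACTORS, IVT_FACTORS):
        bss_cost = round(cost * cost_f, 2)
        bss_ivt = round(ivt * ivt_f, 1)
        scenarios.append({
            "bss_cost": bss_cost,
            "bss_ivt": bss_ivt,
            "bss_ovt": bss_ovt,
            "cost_gain": round(cost - bss_cost, 2),
            "ivt_gain": round(ivt - bss_ivt, 1),
            "ovt_gain": round(ovt - bss_ovt, 1),
        })
    return scenarios


if __name__ == "__main__":
    # A made-up bus user: 1.00 euro fare, 20 min IVT, 10 min OVT.
    for scenario in build_scenarios(cost=1.00, ivt=20.0, ovt=10.0, bss_ovt=4.0):
        print(scenario)
```

Anchoring each alternative to the respondent's own stated trip keeps the choice task realistic for that person, which is presumably why the competing values were computed per interviewee rather than fixed in advance.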
For each of the three groups of potential users (car users, bus users and pedestrians) a classification tree and a random forest model were built. The sample was split into a training and a testing set using a 4:1 split ratio, and the models' results were validated on the testing set. 10-fold cross-validation was used to fit the models, and the hyperparameters of both models were tuned using grid search to select the optimal values and increase the models' predictive performance. To obtain shorter, more interpretable trees, the tree models were further pruned so that each leaf node contained at least 50 observations. The analysis was performed in the R language for statistical computing [23]. The package dplyr [24] was used for data manipulation and the packages rpart [25], randomForest [26] and caret [27] for model fitting. The results were visualized with the rpart.plot [28] and ggplot2 [29] packages.

The classification error of the classification tree shown in Fig. 1 was 18.59%. As can be seen from the tree, variables related to mode-specific features such as duration and cost are closer to the root node (upper part of the tree). Reduced trip duration and especially increased IVT gain appear to be critical factors towards preferring the BSS. The thresholds chosen by the model for those factors are relatively low, as the first split of the dataset separates trips shorter and longer than 21 min, while car users would require significant IVT benefits in order to switch to the BSS. On the other hand, while a reduced BSS cost does make preferring the BSS more likely, car users can prefer the BSS even at a slightly higher cost per trip. The combination of factors that seems to increase the likelihood of preferring the BSS over the car is low trip duration, high IVT gains, relatively younger age groups and work as the trip purpose. The random forest model fitted for car users had a classification error of 10.70%.
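The way a classification tree arrives at a threshold such as the 21 min trip duration, and the effect of a minimum leaf size, can be illustrated with a small Python sketch. It searches one numeric variable for the cut point that minimizes the weighted Gini impurity, which is the default splitting criterion for classification in rpart. The function names and the eight toy observations are made up for illustration; this is not the authors' R code, and a real tree repeats this search over all variables and recursively within each branch.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = respondent chose the BSS)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels, min_leaf=1):
    """Find the threshold on one numeric variable that minimizes weighted Gini.

    Candidate thresholds are midpoints between consecutive distinct values.
    Splits leaving fewer than `min_leaf` rows on either side are skipped.
    Returns (threshold, weighted_impurity); the threshold is None when no
    admissible split improves on the impurity of the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between identical values
        if i < min_leaf or n - i < min_leaf:
            continue  # would create a leaf that is too small
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_impurity = impurity
    return best_threshold, best_impurity


if __name__ == "__main__":
    # Toy data only: trip durations in minutes and whether the BSS was chosen.
    durations = [5, 10, 15, 20, 22, 30, 40, 50]
    chose_bss = [1, 1, 1, 1, 0, 0, 0, 0]
    print(best_split(durations, chose_bss))              # (21.0, 0.0)
    print(best_split(durations, chose_bss, min_leaf=5))  # (None, 0.5)
```

The second call shows why a minimum leaf size shortens trees: with only eight rows, no split can leave five rows on each side, so the node stays unsplit.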
The relative importance of the variables used in the model is shown in Fig. 2. Similarly to the results of the classification tree, the variables that contribute most to decision making are cost gain, trip duration and IVT gain, while household income also stands out.

The classification error of the classification tree shown in Fig. 3 was 25.86%. As can be seen from the tree, variables related to mode-specific factors are closer to the root node of the tree and seem more crucial for splitting the sample into homogeneous groups based on mode choice. Increased cost gain appears to play an important role in choosing the BSS over the bus. Users with higher education were more likely to prefer the BSS, especially with decreased IVT compared to the bus. The combination of factors that most increases the likelihood of preferring the BSS is users with higher education who make shorter trips on a daily basis, but only if the BSS is not a much more expensive option. The random forest model fitted for bus users had a classification error of 11.64%. The relative importance of the variables used in the model is shown in Fig. 4. Similarly to the results of the classification tree, the most important factor contributing to decision making is the cost gain, while IVT gain, trip duration and OVT (Out of Vehicle Time) gain, as well as household income, are also found to be important.

The classification error of the classification tree shown in Fig. 5 was 22.76%. As can be observed from the tree, variables related to mode-specific factors are closer to the tree's root and are more significant for splitting the data into homogeneous groups based on the respondents' choices. Decreased BSS cost appears to be crucial for preferring the BSS, as 90% of the pedestrians preferred walking if the cost was higher than 0.75 euros. Time gains are not found to be an important factor in the decision when the BSS cost is high.
Nevertheless, time gains from trips with the BSS also increase the chances of preferring the BSS over walking. The combination of factors that most increases the likelihood of preferring the BSS is low BSS cost and increased time gains compared to walking. The random forest model fitted for pedestrians had a classification error of 10.57%. The relative importance of the variables used in the model is shown in Fig. 6. Similarly to the results of the classification tree, the most important factor contributing to decision making is the cost of the BSS, by a significant margin over the second, which is the time gain from preferring the BSS. Household income, trip duration and the age group of the users follow.

Taking into consideration all the results presented in Sect. 4, recurring or common themes, as well as distinct differences, can be noticed. Cost gain, trip duration and IVT gain are the three variables with the greatest importance for all three user groups. Both car and bus users would be willing to use the BSS even if its cost is slightly higher than that of their preferred mode, but only if using it is accompanied by IVT gains, while pedestrians would consider using the BSS if the cost is low and it offers them trip time gains. OVT gains can be a significant factor towards using the BSS for bus users, something that is true to a much lesser degree for car users. If the time it takes to locate a shared bike and get to its location is significantly less than the time it takes to walk to the bus stop and wait for the bus, this time gain is more likely to make them consider using the BSS. Furthermore, the relative importance of the possible cost gain that comes with using the BSS is higher for bus users and pedestrians than for car users, indicating that for those groups keeping BSS costs as low as possible is a more crucial factor.
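The variable-importance rankings compared above can be read more easily with a concrete picture of how such a score may be obtained. The text does not state which importance measure was plotted; the randomForest package reports both a permutation-based mean decrease in accuracy and a mean decrease in Gini impurity. The Python sketch below illustrates the permutation idea only, under stated assumptions: `predict` is a hypothetical stand-in for any fitted classifier, the data are toy values, and importance is measured on the supplied rows rather than on out-of-bag samples as a random forest would do.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows for which predict(row) equals the observed label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when each column is shuffled in turn.

    `predict` takes one row (a list of feature values) and returns a class
    label. A large drop means the model relies heavily on that column.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


if __name__ == "__main__":
    # Toy pedestrians: [bss_cost_eur, age]. The stand-in model looks only at
    # cost, so shuffling age cannot change any of its predictions.
    rows = [[0.25 + 0.05 * i, 20 + i] for i in range(20)]
    labels = [1 if row[0] <= 0.75 else 0 for row in rows]

    def model(row):
        return 1 if row[0] <= 0.75 else 0

    print(permutation_importance(model, rows, labels))
```

In this toy setup the age column receives an importance of exactly zero, while the cost column receives a positive score, mirroring the kind of gap between the first and second variables reported for pedestrians.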
The above results can have a meaningful impact on shaping the landscape of future urban mobility, so that it incorporates a modal share that makes the most of each mode's strengths. Gaining a deeper understanding of the mechanisms that drive decision making in mode choice can become a powerful tool for policy makers and mobility stakeholders, allowing them to formulate future management policies and operational and business plans in a way that promotes three-pronged sustainability and maximizes each mode's positive externalities. Future research can include further analysis of the willingness to use BSS, by quantifying the effect of incentives on different user groups and for different trip lengths.

An evidence-based approach to physical activity promotion and policy development in Europe: contrasting case studies
Bicycle infrastructure and traffic congestion: evidence from DC's Capital Bikeshare
Bike-sharing: history, impacts, models of provision, and future
China's Hangzhou public bicycle: understanding early adoption and behavioral response to bikesharing
Bikesharing in Europe, the Americas, and Asia
To be or not to be dockless: empirical analysis of dockless bikeshare development in China
Institute for Transportation & Development Policy: The Bikeshare Planning Guide
Capital Bikeshare 2011 Member Survey Report
Are bikeshare users different from regular cyclists?
Inequalities in usage of a public bicycle sharing scheme: sociodemographic predictors of uptake and usage of the London (UK) cycle hire scheme
Station-level forecasting of bikesharing ridership
Use of a new public bicycle share program in Montreal, Canada
Evaluating public transit modal shift dynamics in response to bikesharing: a tale of two U.S. cities
Better understanding of factors influencing likelihood of using shared bicycle systems and frequency of use
Bike lanes and other determinants of capital bikeshare trips
Walking, bicycling, and urban landscapes: evidence from the San Francisco bay area
Ridership and effectiveness of bikesharing: the effects of urban features and system characteristics on daily use and turnover rate of public bikes in China
Factors influencing the choice of shared bicycles and shared electric bikes in Beijing
Providing quantified evidence to policy makers for promoting bike-sharing in heavily air-polluted cities: a mode choice model and policy simulation for Taiyuan-China
R: A language and environment for statistical computing
dplyr: A Grammar of Data Manipulation
rpart: Recursive Partitioning and Regression Trees
Classification and Regression by randomForest
caret: Classification and Regression Training
Models: An Enhanced Version of "plot.rpart
ggplot2: Elegant Graphics for Data Analysis

Acknowledgements. This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH - CREATE - INNOVATE (project code: T1EDK-04582).