key: cord-0493902-8r1knjx3
authors: Barbado, Alberto; Corcho, Óscar
title: Vehicle Fuel Optimization Under Real-World Driving Conditions: An Explainable Artificial Intelligence Approach
date: 2021-07-13
journal: nan
DOI: nan
sha: 2c727d4e8cf2a402974caf49425e04a83a5501a4
doc_id: 493902
cord_uid: 8r1knjx3

Fuel optimization of diesel and petrol vehicles within industrial fleets is critical for mitigating costs and reducing emissions. This objective is achievable by acting on fuel-related factors, such as the driving behaviour style. In this study, we developed an Explainable Boosting Machine (EBM) model to predict the fuel consumption of different types of industrial vehicles, using real-world data collected from 2020 to 2021. This Machine Learning model also explains the relationship between the input factors and fuel consumption, quantifying the individual contribution of each of them. The explanations provided by the model are compared with domain knowledge in order to check whether they are aligned. The results show that 70% of the categories associated with the fuel factors are consistent with the previous literature. With the EBM algorithm, we estimate that optimizing driving behaviour decreases fuel consumption by between 12% and 15% in a large fleet (more than 1000 vehicles).

Reducing fuel consumption within a company's vehicle fleet is critical, since it affects several aspects, such as economic costs and, for fuels such as petrol and diesel, emissions. For example, for a company operating in Spain with 100 diesel vehicles that have an average fuel consumption per vehicle and month of 30 liters, the monthly economic cost will be 3930 Euros, taxes included (considering the average price per liter of petrol in Spain: 0.609 Euros without taxes; 1.31 Euros after taxes, as of March 2021 [1]). In addition, it has an environmental impact in terms of emissions (e.g. CO2), principally for diesel and petrol vehicles.
In Spain alone, the average monthly fuel consumption for the automotive sector (diesel only) is around 1.8M T (in December 2020) [2]. Considering an estimate of 2.67633 kg of CO2 per liter of diesel [3], this translates into 4.82M T of CO2 emissions each month. It is true that these emissions will be reduced by the transition to hybrid and electric vehicles. However, in the US it is estimated that only 7% of vehicles will be electric by 2030 [4]. This highlights the need for complementary solutions that help reduce vehicle emissions in the meantime, while vehicles are progressively replaced by electric ones.

The reduction of both economic costs and emissions is achievable by optimizing the fuel consumption of the individual vehicles of a fleet. According to the literature, this can be done by acting on different aspects that affect the fuel consumed by a vehicle during a route. For instance, [5] indicates that aggressive driving can account for around 26% of the total fuel consumed by a vehicle. This means that simply optimizing the driving style of the drivers of a fleet has a significant direct impact on both economic costs and emissions.

The literature analysis on the variables that impact fuel consumption is useful in itself. However, it can be complemented with techniques that automatically explain, for an individual fleet, which actionable features are affecting fuel consumption, and by how much. This could help quantify the potential economic savings and emissions reduction for that particular fleet. One set of techniques that addresses this problem is Explainable Artificial Intelligence (XAI) algorithms. Before XAI, there was a dichotomy on whether to use whitebox algorithms or blackbox ones to solve AI-related problems.
Whitebox algorithms can directly explain the relationship between input and output features, in exchange for potential limitations on how that relationship is modelled. For instance, a Linear Regression model is considered whitebox, but its modelling limitation is that the relationship inferred between input and output is linear. On the other hand, blackbox algorithms can potentially infer better relationships between input and output (e.g. by inferring non-linear relationships), but in exchange they cannot provide clear explanations about those relationships. XAI came to close this gap by discovering ways either to apply algorithms that explain the relationships in a blackbox model, or to use new whitebox algorithms that can infer complex relationships between input and output. This last case is what happens with the Explainable Boosting Machine (EBM). EBM is an algorithm that provides feature-relevance based explanations (similar to a Linear Regression model) that make it possible to see the individual impact of the input features on the output.

XAI in general, and the EBM algorithm in particular, is seen as useful within the literature to understand the relationships between a set of input features and an output one. To the best of our knowledge, it has not been studied within the field of vehicle fuel consumption. Using an algorithm such as EBM can be useful for solving the aforementioned problem: understanding the impact that different actionable features have on the fuel consumption of a particular fleet. With that, fleet managers and fleet operators can discover potential ways of reducing economic costs while looking after the environment, contributing to environmental Sustainable Development Goals (SDG), like SDG11 and SDG12 [6].
Following this, the main contributions of our work are:
• Use real-world industry data sets that represent different types of vehicles, with data gathered from telematics devices connected to the vehicles for more than one year, in order to elicit up to 70 features that may have a potential impact on fuel consumption according to the State of the Art (SOTA).
• Design a solution that first trains the EBM algorithm over that input data, and then generates explanations that quantify the potential impact that each input factor may have on fuel consumption. These explanations are combined with business knowledge that aims to align them with prior domain expertise. This solution also includes how to evaluate the explanations from a domain knowledge point of view.
• Measure the predictive power of the EBM algorithm over this real-world input data related to fuel consumption, while also measuring the quality of the explanations in a quantitative manner using prior domain knowledge.
• Quantify the potential impact that driving behaviour has in the different vehicle fleets considered through the explanations provided by the EBM algorithm.

The rest of the paper is organized as follows. First, we describe the related work on factors that impact fuel consumption, together with previous works regarding the usage of Machine Learning (ML) within the context of vehicle fuel consumption. Then, we describe our proposal, first explaining the EBM algorithm, and then explaining the system architecture that we use for generating the explanations and quantifying the impact of the different feature groups on fuel consumption, together with how to evaluate the explanations from a domain knowledge point of view. Following this, we present an empirical evaluation using real-world industry IoT data belonging to different fleets of vehicles. We conclude by outlining potential future lines of research.
Fuel consumption can vary significantly from one vehicle to another, even when comparing two vehicles of the same make, model, year and fuel type. This is caused by different factors that may increase or decrease the amount of fuel consumed during the same trip. The literature contains many studies that identify these factors and assess how much fuel could be saved when they are optimized. This is very relevant for fleet managers.

[7] presents a literature review of different factors that have a potential impact on the fuel consumption of a vehicle, together with their relative importance. Figure 1 shows the categories of fuel factors considered in that review.

FIGURE 1: Categories of fuel factors discussed in [7]

The first category considered is travel-related factors. This group includes factors that are related to the route covered by the vehicle. In fact, the authors mention eco-routing as a crucial aspect to reduce fuel consumption. Fuel can be saved by choosing an optimal route not only in classical terms of distance and travel time, but also in terms of a route that saves fuel compared to other possible ones (e.g. choosing routes with fewer "bumps" or "slopes"). In fact, the new route may even be longer in time or distance, but still offer fuel savings. The paper indicates that eco-routing alone can reduce the fuel consumption of a vehicle by 18% to 23%.

The second category includes weather-related factors. These factors impact the fuel consumption of a vehicle in an indirect way (e.g. by being related to the usage of the air conditioner, by affecting the water pump, or by increasing the engine or transmission friction in cold weather). Thus, this category includes factors like the exterior temperature, the relative humidity or the wind effects. These factors may be responsible for about 1% of the fuel consumption of a vehicle.

The third group of factors is named vehicle-related.
It includes factors mainly related to the engine and the vehicle itself, such as vehicle load, vehicle speed, engine speed, type of fuel, and whether or not the vehicle has an exhaust after-treatment system.

The fourth group is named roadway-related factors. It refers to factors related to the road condition, like the road slope, the surface roughness, or the road curvature. These factors, though not very actionable (sometimes it is difficult to avoid them), have a large impact on fuel consumption (around 5 to 20%).

The fifth group of factors refers to traffic conditions. They are closely related to a good arrangement of traffic signs, such as traffic lights. They potentially have the biggest fuel impact (around 22 to 50% of the fuel consumption).

Finally, the sixth group mentioned in the review is the driver-related factors, like the driving behaviour or the aggressiveness of the driving. The driving profile of a particular driver (which measures aspects such as driving aggressiveness) is calculated with vehicle information such as the RPM (engine speed; revolutions per minute), the speed or the acceleration. The authors mention that aggressive driving can be responsible for up to 40% of the fuel consumption of a vehicle when compared to a calmer driving style.

The aforementioned literature review is enhanced by the study of [5]. Here the authors present a thorough analysis of the influence of different factors on the fuel consumption of a vehicle, along with their influence on CO2 emissions. This study considers passenger vehicles under real-world operating conditions. Regarding fuel consumption specifically, the authors offer a summarized view of the literature showing different categories of variables and their proportional impact on the fuel consumption of a vehicle. There are two approaches for analysing the impact of a specific factor on the fuel consumption of a vehicle.
The first is a simulation analysis that studies the isolated impact of a factor under laboratory conditions. The second is the analysis of data feeds that contain the instant fuel consumption reported during trips in real-world environments. These data feeds can be gathered from sources such as the OBD-II (on-board diagnostics) port [8] (e.g. the Engine Fuel Rate with the PID 015E). The analysis of the literature highlights that both approaches generally offer similar results (when there are publications available for a specific factor both from the simulation point of view and with real-world data). Thus, real-world collected data can be a valid data source for assessing the impact of different factors on the fuel consumption of a vehicle.

TABLE 1: Fuel factors mentioned in the literature, together with the relative importance as reported by [5]

The literature review in [5] proposes a fuel factor taxonomy that in some cases directly matches the one proposed in [7], but in others differs. There are 28 factors that can be classified into 9 groups. All these factors, as reported by [5], appear in Table 1. This table shows the relative importance of each of the factors (literature median value) along with an interval that encloses the different values reported, considering vehicles under real-world operating conditions. It also shows how many papers discuss that particular factor, as well as the distribution of the relative values reported.

Regarding driver-related factors, Table 1 shows a group called driving behaviour/style that accounts for factors related directly to the driver. It is almost the same as the one in Figure 1, with the exception that it also considers factors related to good driving styles that may reduce fuel consumption. The road conditions group mainly includes the travel-related, traffic-related and roadway-related factors of [7]. Vehicle-related is the group with the most differences in factors between the two papers.
Compared to [7], these factors are split into auxiliary systems, vehicle conditions and fuel characteristics, complemented with other groups that include factors related to the vehicle's design itself (aerodynamics and operational mass) and to certification test margins. In this last review, all these vehicle-related factors account for aspects related to the vehicle itself, not considering anything directly related to the driver. This differs from the taxonomy of [7], where the vehicle-related group includes acceleration and speed factors. The differences between the analyses shown in both articles are not only in terms of the taxonomy proposed to group factors, but sometimes also in the reported impact (e.g. exterior temperature has a median reported impact of 10% in [5], against the 1% impact for all weather-related causes reported by [7]).

TABLE 2: Reduced view of the factors of [5], focusing on some of the actionable variables that can be retrieved from OBD-II. The lower and upper limits refer to the minimum and maximum SOTA values reported in the review. For Rain, the lower limit is set to zero since the review does not provide limits for that feature.

Within this last taxonomy of features that affect the fuel usage of a vehicle, some could be considered "actionable", meaning that they could be changed in a particular vehicle, in some cases without even needing to change the vehicle's route. An example of this is the aggressive driving style. Other features are inherent to the vehicle and cannot be directly changed, like the vehicle make/model or the vehicle mass. Even among the "actionable" features, some cannot be easily read through OBD-II (e.g. whether there is a roof add-on, which affects the vehicle aerodynamics). Thus, the subset of these features that are both "actionable" and readable is the one shown in Table 2.
The physical reasons as to why these features impact the fuel usage are:
• Air conditioning (A/C): Using A/C increases the energy supply needed, leading to increased fuel consumption. The time using the A/C and the power needed will increase or decrease that extra energy required. This category also includes the heating systems and related features, like the vehicle's coolant.
• Steering assist system: These systems make driving safer and more comfortable, but require additional electrical supply in exchange. An example is the usage of Electric Power Assisted Steering (EPAS).
• Other vehicle auxiliaries: These features include other auxiliary elements of the vehicle that may also require extra energy. An example is the usage of the vehicle lights, which requires extra energy and, due to that, extra fuel.
• Rain: Rain (and snow) impact the fuel usage in different ways. First, they affect the grip of the wheels on the road surface. Also, the wheels have to push through an additional layer of water (or snow), so extra energy is required.
• Ambient temperature: Temperature affects the tyres, the motor oil viscosity, the cold start of the engine, and so on. Extra fuel is required at low temperatures to warm up the engine. It also affects aerodynamics: increased air density and higher aerodynamic resistance.
• Aggressive driving: Aggressive driving is shown through different variables: acceleration patterns, gear changes, harsh turns, harsh brakes, speeding, and so on. The impact on the fuel usage can be high.
• Eco driving: Eco driving is related to the optimal driving of a vehicle, which may reduce its fuel usage. It involves, among other things, optimizing the gear shifting (related to the usage of cruise control) and choosing the best possible route thanks to a navigation device.
• Lubrication: Overcoming friction within the vehicle's components requires energy, and this is related to the fuel usage. If the friction is minimized thanks to adequate lubrication, the energy required will be lower.
• Tyres: Tyre pressure is related to the rolling resistance coefficient (RRC). When the tyres have low pressure, the contact surface with the road increases and more energy is needed to rotate the wheel (as the friction increases).
• Vehicle extra mass: Extra mass in a vehicle (measured, for instance, in additional 100 kg steps) increases the energy needed to move the vehicle. This may happen, for instance, when there are additional passengers in a vehicle.
• Altitude: At higher altitudes the air density is lower, so the air resistance that the vehicle faces while driving is also lower. This means that at higher altitudes the vehicle needs less energy to move the same distance.
• Driving uphill: Driving uphill adds an extra load on the vehicle, which needs additional energy to move. By contrast, driving downhill reduces the amount of energy needed.
• Road roughness: For instance, if a road has many bumps, the vehicle will need additional energy to go through it.
• Traffic condition: Traffic conditions also affect the fuel usage. For instance, if there are traffic jams, the idling time normally increases, leading to an increased average fuel consumption.
• Trip type: The trip type also affects the fuel usage. For instance, if the trip distance is small, the average fuel consumption will increase, since fuel is required to start the vehicle.

There are some additional factors affecting fuel consumption that the previous references did not mention. This is the case of Diesel Exhaust Fluid (DEF). DEF is a urea-based product used in after-treatment processes of the vehicle, such as Selective Catalytic Reduction (SCR). It is applied over the vehicle's exhaust stream in order to transform the NOx gas emissions into nitrogen, water and CO2, reducing the NOx emissions in the process [9]. Techniques like SCR not only reduce the emissions of a vehicle, but also help the engine performance and may lower fuel consumption [10, 11].
The factors already mentioned are linked to passenger vehicles, but for other vehicles, such as trucks, there are additional ones to consider. This is the case of power take-off, where power from the engine is taken out (e.g. with a splined drive shaft) and used in another application (e.g. for a cement mixer in a truck). This directly affects the mileage of a vehicle [12].

All these references show that there is a physical and empirically measured connection between the value of specific factors and the value of the fuel consumption. Thus, it is possible to use them to predict the fuel consumption with ML models, as already shown in the literature [13, 14, 15].

As we mentioned in the previous subsection, there are several features that affect the fuel consumption of a vehicle. This can be measured by applying Machine Learning (ML) algorithms to the data feeds gathered from the vehicle's movement. This is the case of [16], where the authors conduct a study over a fleet of vehicles in which they assess the impact of driving behaviour on fuel consumption. They consider features related to driving behaviour, such as the gas pedal position, the speed and speed variance, or the steering angle, and they first show that those features have significant correlations with fuel consumption. Then, they use several clustering algorithms (Spectral clustering, KFCM, K-Means), finding different clusters based on the driver consumption profile and its relationship with those driving behaviour features.

In [15], the authors analyse the impact of other features on fuel consumption within the context of trucks. The 56 features used include characteristics of the vehicle, such as its gross weight, together with others belonging to driving behaviour (usage of cruise control, average speed, etc.), as well as information about the road (like the road surface macrotexture, or the curvature of the road).
Those input features are found to be correlated with the fuel consumption (using a bivariate correlation analysis), and are then used to train several ML models (ANN, SVM, Random Forest) in order to predict the fuel consumption of the trucks. For the case of Random Forest, the authors examined the relative impact of the different features on fuel consumption through their contribution to accuracy during the tree splitting process.

The previous approaches are useful for detecting dependencies between a set of features and the fuel consumption of a vehicle. However, they do not quantify exactly how many extra liters of fuel are spent due to those features. In [17], the authors investigate the impact of eco-driving on fuel consumption. Eco-driving is expressed through several features related to variables such as the Revolutions Per Minute (RPM) or the braking. They first use statistical tests for detecting significant decreases in fuel consumption when an eco-routing driving style is used. Then, they use a Logistic Regression model for analysing the relationship between driver-related features and the fact that the vehicle trip was actually done with eco-routing.

It is possible to use a Linear Regression model for measuring the individual impact of input features on fuel consumption, and to know exactly how many liters are used due to each individual variable. The reason is that those models are known as whitebox because they directly provide the influence of the input on the output [18]. This is shown in [19], where the authors predict the fuel consumption gap between type-approval tests and real-world driving trips, using the information of one vehicle during one year, and with 20 different drivers. With that, they build a multiple linear regression model that takes into account driver-related factors as well as environmental and traffic factors in order to predict the fuel consumption gap.
Through these linear models, they provide the relative importance of each of the features in the fuel consumption, as well as the r2 value for each of the models tested in order to evaluate them. Similarly, in [20] the authors study the impact on fuel of several features related to driving behaviour, inferred through the analysis of the data from two different vehicles. One of these features is the Driving Style Indicator (DSI), which is the average positive acceleration of a vehicle minus the average of the negative acceleration divided by the average speed. The relationship between these features and fuel consumption is modelled through linear regression algorithms in order to quantify the impact of each one of them.

Even though linear regression models can be used for fuel prediction when there is a need for a whitebox ML algorithm that explains the relationship between input and output, this limits the results since the relationship inferred is linear. This problem can be solved by using non-linear whitebox models, such as Generalized Additive Models (GAM). Instead of modelling the relationships between the input features and the output value through a constant coefficient, these models infer individual and additive relationships through non-linear functions. This has proved useful in other domains. In [21], the authors identify risk factors and interaction effects in order to predict intensive care admission in patients hospitalized with COVID-19. In [22], the authors use GAM models to predict goals in soccer along with the quantification of the impact of the different input factors on the output. However, to the best of our knowledge, these models have not been used for both predicting fuel consumption and quantifying the factors that impact it.

In this Section we describe the XAI whitebox algorithm considered in this work for obtaining the explanations, as well as the logic used for generating them.
We also include the schema for the whole process, and the steps involved in analysing and evaluating those explanations.

EBM [23] is a type of whitebox model that provides feature-relevance based explanations. It can be used for both regression and classification tasks, and similarly to other whitebox models, such as Linear or Logistic Regression algorithms, it infers an independent relationship between input features and the output variable. Because of that, it is possible to know the individual contribution of those input features to a particular output value. The advantage of EBM is that it provides the option to infer non-linear relationships, and due to that, it can potentially increase the model generalization [24].

EBM is based on the GA2M algorithm [25], but with a difference in terms of computational performance. EBM is an evolution of Generalized Additive Model (GAM) algorithms [26] because it is not only able to model individual relationships between the input features and the output, but can also model pairwise interactions between input features, and include them as additional terms.

The expression for the EBM algorithm appears in Equation 1 for the regression case.

$$g(E[y]) = \beta_0 + \sum_{i=1}^{n} f_i(x_i) + \sum_{i \neq j} f_{ij}(x_i, x_j) \quad (1)$$

In that Equation, $\sum_{i=1}^{n} f_i(x_i)$ represents the different functions that model the individual relationship between a specific input feature $x_i$ and the output $y$ through a specific link function $g$. Similarly, $\sum_{i \neq j} f_{ij}(x_i, x_j)$ represents the pairwise function term that models the relationship between two input features $x_i$, $x_j$ and the output $g(E[y])$. Finally, $\beta_0$ represents the intercept that adjusts the prediction of the model. For the sake of simplicity in both the training and the explanations generated, we have not considered pairwise interaction terms for the analysis carried out in this paper (the hyperparameters of the model allow choosing whether or not to include them).
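To make the additive structure described above concrete, the following is a minimal sketch of how a model without pairwise terms turns per-feature shape functions into a prediction and into individual contributions. It assumes an identity link and piecewise-constant shape functions stored as bin edges plus one score per bin; the feature names, bin edges, scores and intercept are all hypothetical values chosen for illustration, not taken from the trained model of this paper.

```python
from bisect import bisect_right

# Hypothetical shape functions f_i: bin edges plus one learned score per bin.
# All numbers below are illustrative, not results from the paper.
SHAPE_FUNCTIONS = {
    "harsh_turns": {"edges": [2, 5], "scores": [-0.3, 0.1, 0.6]},
    "trip_km": {"edges": [10, 50], "scores": [0.8, 0.0, -0.4]},
}
INTERCEPT = 7.0  # beta_0, an assumed baseline in L/100 km


def contribution(feature, value):
    """Return f_i(x_i): the score of the bin that contains `value`."""
    shape = SHAPE_FUNCTIONS[feature]
    return shape["scores"][bisect_right(shape["edges"], value)]


def predict(sample):
    """Return (prediction, per-feature contributions) for one vehicle-date.

    With an identity link, the prediction is beta_0 plus the sum of the
    individual contributions, so each term can be read on its own.
    """
    contributions = {name: contribution(name, value) for name, value in sample.items()}
    return INTERCEPT + sum(contributions.values()), contributions
```

Calling `predict({"harsh_turns": 6, "trip_km": 5})` returns the predicted consumption together with a dictionary of per-feature contributions, which is the kind of feature-relevance output that the explanations in the rest of the section build on.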
In this Subsection we show the full process followed for obtaining the explanations and for evaluating them later. This schema appears in Figure 2. Using the databases with vehicle information (described in more detail in Subsection 4.1), we identify fuel outliers and also train the XAI whitebox model that infers the relationship between input features and fuel consumption.

FIGURE 2: Schema for the process of aggregating the explanations in order to compare them with the SOTA.

The anomaly detection step detects vehicles and dates where the average fuel consumption is anomalous, considering the other vehicles of the same model and the dates that are associated with the same route type. That route type is the primary route type of a vehicle in a day (highway, city or combined). The algorithm applied is described in Equation 2. Using the whiskers from a boxplot analysis, the vehicle-dates with anomalous fuel consumption are detected with a univariate approach, considering the average fuel consumption of other vehicles of the same model and route type. This approach is useful since it directly provides a fuel threshold that indicates the amount of fuel that is anomalous for a particular vehicle. In this paper we only use the upper limit for the context of the explanations, since it is the one that identifies high fuel consumption.

An example of these limits appears in Figure 3 for one vehicle. We see that the different dates are categorized as "city", "combined", and "highway", and there is a threshold that highlights the amount of fuel that on some of those days is anomalous for that vehicle. The information provided by the anomaly detection step is used later in the evaluation step to analyse how much of the anomalous extra fuel is covered by the XAI explanations.

Regarding the XAI model, after training it, we get its raw explanations (Daily Explanations step).
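The whisker-based anomaly detection step described above can be sketched as follows. Equation 2 is not reproduced in this text, so the sketch assumes the conventional Tukey upper whisker, Q3 + 1.5 × IQR, computed per vehicle group and route type; the factor 1.5 and the quartile method are assumptions, and the record keys reuse the column names vehicle_group, route_type and avg_fuel_consumption from the explanation table.

```python
from collections import defaultdict
from statistics import quantiles


def upper_whiskers(records, k=1.5):
    """Upper boxplot whisker (Q3 + k * IQR) per (vehicle_group, route_type).

    `k=1.5` is the usual Tukey convention, assumed here for illustration.
    Each group needs at least two observations.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["vehicle_group"], rec["route_type"])
        groups[key].append(rec["avg_fuel_consumption"])

    limits = {}
    for key, values in groups.items():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        limits[key] = q3 + k * (q3 - q1)
    return limits


def flag_anomalies(records, limits):
    """Return the vehicle-dates whose consumption exceeds their group limit."""
    return [
        rec
        for rec in records
        if rec["avg_fuel_consumption"] > limits[(rec["vehicle_group"], rec["route_type"])]
    ]
```

Because the limit is expressed in the same unit as the consumption (L/100Km), the difference between a flagged value and its limit can be read directly as the anomalous extra fuel for that vehicle-date.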
There are several combinations of vehicle-date-feature, where we have the feature relevance for each feature, for each vehicle and for each date. However, since the functions that express the relationship between input features and the output are not linear, we calculate the average fuel reduction for each vehicle and each date if every one of those features changed from its actual value to a reference value (e.g. if the tyre pressure increases from its actual value to the median value that it usually has for the vehicles of the same model). An example of this appears in Figure 4.

FIGURE 3: Output example of the fuel anomaly detection for one vehicle. It shows the maximum limit for that vehicle model with respect to the three route types (city, combined, highway), highlighting the part of the fuel that is anomalous because it is above those limits.

FIGURE 4: Example of the explanation generation. The first image shows the feature relevance for trip kms, and the second one for harsh turns. The explanations for points 1 and 3 correspond to the feature relevance difference with respect to the reference points 2 and 4 respectively.

An example of the explanation output appears in Table 3, where there is a row for each vehicle (vehicle_id), date (date_tx) and feature (e.g. the number of jackrabbit events, count_jackrabbits). The variable vehicle_group indicates the id associated with the group (make, model, year and fuel type) for that vehicle_id. route_type indicates the type of route for that specific date ("city", "combined" or "highway"). Complementing this, avg_fuel_consumption indicates the average fuel consumption for that vehicle on that date (L/100Km), and limit_group the threshold limit that identifies when the average fuel consumption for a particular vehicle is anomalous (L/100Km). Along with that, it includes the intercept. Then, feature_relevance contains the feature relevance for that vehicle-date, and feature_value its corresponding value.
target_value shows the recommended value for changing that particular feature, and following that, y_diff shows the average fuel consumption (L/100Km) that would be reduced by doing that. y_fuel_new shows the new average fuel consumption (L/100Km) that would be achieved by applying all the recommendations from the explanations on a particular day. A final note is that since the explanations show potential fuel savings when we change a feature value to a target one, the reduction is obtained in the same way regardless of whether we need to increase the feature value or decrease it (as shown in Figure 4).

After that previous step, we apply several business rules for filtering some of the explanations generated (Explanation Filter step). These rules are:
• BR1: The features used for training the model may be numeric (e.g. time driving uphill) or categorical (e.g. the vehicle model). All categorical features are one-hot encoded before training the model. However, they are not considered for the explanations since they are not actionable.
• BR2: We remove the features in the vehicle-date explanations that have a very low impact on the fuel consumption (relative impact below 1%).
• BR3: The explanations only include vehicles where the average fuel consumption is above the median value of the inlier vehicles of the same model and on the same route type.
• BR4: Feature values must be higher than the median value of the vehicle inliers of the same model for that same feature when the feature Type is Positive, or lower when Type is Negative.
• BR5: The total fuel reduction from the explanations should not be more than 80% of the original fuel consumption. Since EBM does not allow imposing restrictions on the learning of the individual models for the features, we need to apply this post-hoc filtering to remove explanations that are not physically possible.
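The explanation columns and two of the business rules above can be sketched as follows. This is only an illustration under stated assumptions: y_diff is taken as the feature relevance at the actual value minus the relevance at the target value, the key target_relevance is a hypothetical name for the latter, the relative impact in BR2 is assumed to be y_diff divided by the average fuel consumption, and a vehicle-date that violates BR5 is assumed to be discarded as a whole. BR1, BR3 and BR4 are not covered because they need group-level medians.

```python
def filter_explanations(rows, avg_fuel_consumption, min_relative=0.01, max_total=0.80):
    """Apply BR2 and BR5 to the explanations of a single vehicle-date.

    rows: dicts with 'feature', 'feature_relevance' and the hypothetical key
    'target_relevance' (relevance at the recommended target value).
    Returns (kept_rows, y_fuel_new), both expressed in L/100 km.
    """
    kept = []
    for row in rows:
        # Saving if the feature moved from its actual value to the target one.
        y_diff = row["feature_relevance"] - row["target_relevance"]
        # BR2: drop features whose relative impact is below 1%.
        if y_diff / avg_fuel_consumption >= min_relative:
            kept.append({**row, "y_diff": y_diff})

    total_saving = sum(row["y_diff"] for row in kept)
    # BR5: a total saving above 80% of the consumption is treated as not
    # physically plausible, so the whole vehicle-date is discarded (assumed).
    if total_saving > max_total * avg_fuel_consumption:
        return [], avg_fuel_consumption
    return kept, avg_fuel_consumption - total_saving
```

For a vehicle-date with an average consumption of 10 L/100Km, a feature with a saving of 0.05 L/100Km falls below the 1% threshold and is removed, while a saving of 0.8 L/100Km is kept and lowers y_fuel_new accordingly.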
Finally, the schema shown in Figure 2 is what some authors have identified as the property of being "consistent with a priori beliefs" [27, 28]. Following this, first, we use the individual explanations generated by the aforementioned solution and, after applying BR1, BR2 and BR3, we aggregate the relative impact on the average fuel consumption following the categories described in Table 2 from [5], in order to see if the impact per category is aligned with the SOTA (Evaluation 2 in Figure 2). Second, we compare the new average fuel consumption (after applying the recommendations from the Daily Explanations step) to the catalog fuel reference for vehicles of the same make, model, year and fuel type on that specific route type (city, combined or highway). The intuition behind this is that if we remove the extra fuel due to driving behaviour, traffic conditions and so on, the vehicle's fuel consumption should be close to its catalog reference. The databases that we use to get this catalog fuel reference are [29], [30], [31] and [32]. A consideration to take into account is that there may be many entries in the databases for the same make, model, year, fuel type and route type. In these cases, we use as catalog reference the median fuel value over all those entries.

We apply the XAI proposal described in Subsection 3.2 to different industry data sets to evaluate the hypotheses described below. For all these hypotheses we use as source of information the feature-relevance explanations yielded by the XAI algorithm, since they account for the individual impact of the different features on the fuel consumption. The hypotheses are connected to the evaluation steps mentioned before in Figure 2.

Regarding the "Model Evaluation" step:
• Regarding H1, we include a comparison against other blackbox models, as well as a comparison against simpler whitebox ones.
In particular, we include a linear regression model fitted with the ElasticNet [33] algorithm as the whitebox alternative and, as blackbox models, the following tree-based methods: XGBoost [34] and LightGBM [35].

For the "Domain Knowledge Evaluation" (Evaluation 1) step:
• Hypothesis 2 (H2): The relative fuel impact explained for the different feature subclasses from Tables 10 and 11 is within the literature limits shown in [5], or at least below the maximum limit.

As for the "Domain Knowledge Evaluation" (Evaluation 2) step:
• Hypothesis 3 (H3): The relative fuel impact explained for the vehicle-dates with anomalous fuel consumption is not significantly lower than the relative extra fuel detected by the outlier detection algorithm.
• Hypothesis 4 (H4): The new average fuel consumption after applying the recommendations from the "Daily Explanations" step is similar to the catalog reference for that same vehicle's make, model, year and fuel type on that route type. It will also be similar to the median historical value of the vehicles from the same group without fuel anomalies. This will be measured in terms of the MAPE against those reference values.

The reference values for MAPE in H1 and H5 are the ones that appear in [36]. Those reference values were originally defined for forecasting models, but we use them for this regression use case, since MAPE is a metric also used for regression models [37].

• The data source is the real-time feed of data retrieved by a telematics device connected to the on-board diagnostics (OBD) port. In particular, we retrieve the data from devices connected to the OBD-II port, since they allow an easy and direct retrieval of relevant features, such as the fuel consumption through the Engine Fuel Rate with the Parameter ID (PID) 015E [8]. A sample of these raw data with a csv structure can be seen in Table 4, and is also available at [41].
Over the FAR, we apply several quality assurance analyses, discarding records that are not meaningful or that may not be useful for training the model. The criteria followed are:
• Remove records with missing trip distance, or with a low trip distance on that day (Km < 5).
• Remove records with missing fuel, or with a fuel consumption that is too small, according to Equation 2. This Equation is used first for discarding vehicles with an average fuel consumption that is too low (below the lower limit). Then, after the Noise Elimination step, it is applied again to the remaining data to obtain the upper limit for the univariate anomaly detection step from Figure 2.
• Remove records with a potentially wrong fuel value due to being extremely high. Besides fuel anomalies that correspond to certain feature values, there are also noisy data regarding fuel consumption that need to be eliminated in order to help the training of the ML model. We also use Equation 2 here. These data points are also removed before computing the final limits for the anomalous fuel consumptions that are not noisy data at the univariate anomaly detection step from Figure 2.

Finally, we fill the remaining missing values of a particular vehicle on a particular date with the median historical value of that feature over the vehicles in the fleet with the same make, model, year and fuel type.

This process is done for different fleets independently, resulting in an individual data set for each one of them. With that, we use 9 data sets from different fleets, as indicated in Table 5. These data sets contain different types of vehicles that are identified with two groups of variables. The first one is the vehicle's make, model, year and fuel type. Since fuel consumption depends on the type of vehicle (among other things), we use the Vehicle Identification Number (VIN) to identify those variables. With that, we get the different types of models that appear in column "N models".
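The record filtering and the median imputation described above can be sketched as follows. This is a simplified illustration under stated assumptions: records are plain dictionaries, the column names `trip_fuel_used`, `vehicle_group` and `tyre_pressure` are used as hypothetical field names, and the lower and upper fuel limits from Equation 2 are omitted.

```python
from statistics import median

MIN_TRIP_KMS = 5  # threshold stated in the text (Km < 5 is discarded)


def drop_unusable(records):
    """Remove records with missing or short trip distance, or missing fuel."""
    return [
        r for r in records
        if r["trip_kms"] is not None
        and r["trip_kms"] >= MIN_TRIP_KMS
        and r["trip_fuel_used"] is not None
    ]


def impute_group_median(records, feature, group_key="vehicle_group"):
    """Fill missing values of `feature` with the median of the same group."""
    observed = {}
    for r in records:
        if r[feature] is not None:
            observed.setdefault(r[group_key], []).append(r[feature])
    medians = {group: median(values) for group, values in observed.items()}
    filled = []
    for r in records:
        if r[feature] is None:
            # Stays None when the group has no observed value at all.
            r = {**r, feature: medians.get(r[group_key])}
        filled.append(r)
    return filled


# Hypothetical vehicle-date records for one vehicle group.
records = [
    {"vehicle_group": "g1", "trip_kms": 40.0, "trip_fuel_used": 3.0, "tyre_pressure": 2.4},
    {"vehicle_group": "g1", "trip_kms": 3.0, "trip_fuel_used": 0.3, "tyre_pressure": 2.5},
    {"vehicle_group": "g1", "trip_kms": 25.0, "trip_fuel_used": None, "tyre_pressure": 2.2},
    {"vehicle_group": "g1", "trip_kms": 60.0, "trip_fuel_used": 4.5, "tyre_pressure": None},
    {"vehicle_group": "g1", "trip_kms": 30.0, "trip_fuel_used": 2.4, "tyre_pressure": 2.6},
]

cleaned = impute_group_median(drop_unusable(records), "tyre_pressure")
```

Filtering before imputing means that the group medians are computed only from records that will actually be used for training; whether the original pipeline follows this order is an assumption.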
Along with that, since some models may have similar fuel consumption, we add an additional variable, named vehicle class, that groups those vehicles together (e.g. "Large Pick-Ups"). This vehicle class is inferred directly from the historical mean average fuel consumption, following the classification in Table 6, which appears in [42, p. 18].

TABLE 5
D1 1551 16 218038 5770 Large 1479 34 1 37 0 0 0 0 0 0 0
D2 1564 16 120605 1809 Large 201 697 75 588 0 3 0 0 0 0 0
D3 316 44 65285 10475 Medium 243 5 1 9 10 5 13 9 21 0 0
D4 252 14 35283 1915 Medium 4 178 61 9 0 0 0 0 0 0 0
D5 165 20 22402 724 Medium 165 0 0

TABLE 6 Vehicle classes according to their average fuel consumption, as appears in [42, p. 18]

With that, we conduct the analyses over fleets of vehicles that differ among themselves, in order to provide results that are as general as possible. Those fleets include fleets of passenger vehicles (such as D1), as well as heavy-duty vehicles like trucks (such as D3). We also cover different fleet sizes, namely "Large/Enterprise" (D1 and D2), "Medium" (D3, D4, D5 and D6) and "Small" (D7, D8 and D9), following the categorization of [43], where fleets with more than 500 vehicles are considered "Large/Enterprise", fleets between 50 and 499 "Medium", and fleets with fewer than 49 vehicles "Small". This is indicated in column "Size". Finally, column "N outliers" indicates the number of vehicle-dates with fuel anomalies, according to the univariate outlier detection from Section 3.

As already mentioned, the features considered for each of those data sets appear in Tables 10 and 11. Column "Name" includes a descriptive name for each feature, and column "Description" contains a descriptive text about it. "Unit" indicates the units of each feature, and "Notes" describes some of the variables and why they may have an impact on fuel consumption (particularly for the ones that are not trivial).
The column "Type" shows the type of impact that those features have on fuel consumption. If the type is "Positive", increasing that feature value will normally increase fuel usage. An example of this is the number of events with high RPM; more events lead to more fuel consumption. On the contrary, if the type is "Negative", increasing that feature value will normally decrease fuel usage. An example of this is the time using speed control; more time using it should lower the fuel consumption (versus not using it). Another example is the tyre pressure; when it decreases, the fuel used increases. Column "Reference Zero" indicates the features whose reference value is set to zero in order to measure their impact on fuel consumption. For instance, to obtain the feature impact of a variable like "rpm_high", this variable is set to 0, and the reduction in fuel consumption due to it is the decrease with respect to the current feature value. For the remaining features the reference is, by default, the median value for that feature over the vehicles with fuel inliers from the same vehicle model. Finally, columns "Category" and "Subcategory" refer directly to the same columns in Table 2 from [5]. The columns that do not have a value in both of these columns are not features used for explaining the fuel (they are relevant for the data set, and some of them are even used in the model, like the vehicle model, but they are not used for explanations). Among these columns is the main driving context detected for each day ("route_type").

FIGURE 6 Daily categorization of route types based on the trip distance (Km) and per_time_city for the data set D1.
This is calculated as follows:
• IF per_time_city ≤ low_th_time AND trip_kms ≥ th_kms THEN route_type = hwy
• ELSE IF per_time_city ≥ high_th_time AND trip_kms ≤ th_kms THEN route_type = city
• ELSE route_type = combined
with th_kms = 30, low_th_time = 0.5 and high_th_time = 0.65.

Thus, we categorize each vehicle-date with a particular route type that may be "city", "highway" or "combined", depending on the total trip kms (trip_kms) and the value of the variable per_time_city. An example of this route type categorization, using the aforementioned threshold values, appears in Figure 6.

The categories considered are "Auxiliary Systems" (for all the features that imply an additional electrical energy consumption), "Driving Behaviour" (driver-related features), "Road Conditions", "Vehicle Conditions" and "Weather Conditions". Regarding "Vehicle Conditions", we have included additional variables within the "Other" subcategory with respect to [5] (e.g. the additional fuel consumption when the DEF level is low), so it does not match the ones covered in that review. Because of that, this subcategory will not be used for checking the hypotheses already mentioned. For the "Rain" subcategory, since the review only provides one reference value, we will only check H1 (since there are no limits per se).

Finally, the dependent variable is the average fuel consumption (L/100Km), calculated as follows:

$$\text{avg\_fuel\_consumption} = \frac{\text{trip\_fuel\_used}}{\text{trip\_kms}} \times 100$$

We divide the Results subsection into three parts: first, the model evaluation analysis; second, the evaluation of the explanations in the context of the domain knowledge; and third, an analysis of the potential impact of the explanations on both fuel saving and emissions reduction.

Considering a train/test split of 90/10 for each data set, we get the results shown in Table 7.
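The route-type rule and the dependent variable defined earlier in this section translate directly into code. The following is a minimal sketch using the thresholds given in the text; the function names are illustrative, and the label "highway" is used for the value written as hwy in the rule.

```python
TH_KMS = 30           # trip distance threshold (Km)
LOW_TH_TIME = 0.5     # at most this share of time in city for a highway day
HIGH_TH_TIME = 0.65   # at least this share of time in city for a city day


def route_type(per_time_city, trip_kms):
    """Classify one vehicle-date as 'highway', 'city' or 'combined'."""
    if per_time_city <= LOW_TH_TIME and trip_kms >= TH_KMS:
        return "highway"
    if per_time_city >= HIGH_TH_TIME and trip_kms <= TH_KMS:
        return "city"
    return "combined"


def avg_fuel_consumption(trip_fuel_used, trip_kms):
    """Average fuel consumption in L/100Km from litres used and Km driven."""
    return trip_fuel_used / trip_kms * 100


# A long day with little city driving, a short urban day, and a mixed day.
examples = [route_type(0.2, 80), route_type(0.9, 12), route_type(0.6, 50)]
```

Any vehicle-date that satisfies neither of the first two conditions falls into "combined", including long trips that are mostly urban.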
As we can see, D1, D2, D4, D5, D6, D7 and D8 are within the "highly accurate forecasting" category, while D3 and D9 are within the "good forecasting" one. Thus, the model is able to infer sufficiently good relationships between the input data and the fuel consumption, and H1 is validated. Because of that, it can be used for extracting explanations in order to evaluate the remaining hypotheses. In fact, over the 4-month period considered, the models yielded explanations for 74971 vehicle-dates across all those fleets. When considering only vehicles whose median MAPE over the test set is "Good forecasting" or better, 96% of the explanations are covered, and 76% when considering only vehicles with "Highly accurate forecasting". Complementing this, in Table 7 we see the adjusted R2 value on the test set for each data set. Every data set is within the "substantial" R2 category, with the exception of D4 and D8. Compared to the other models, EBM improves on the ElasticNet model for every metric and in every data set. Regarding LightGBM and XGBoost, EBM is similar for most fleets and for both metrics. In some cases, like D7, EBM is actually better than the other two models for both metrics. In D1, EBM is better than LightGBM in terms of MAPE.

For evaluating the explanations, we focus on the 4 months of data that include the winter period (in order to be able to assess the impact of the ambient temperature). Using the models, we get the explanations for each vehicle-date in that period, and we aggregate the median feature impact values per subcategory and per vehicle fleet. The median results regardless of the fleet appear in Figure 8, and the median results per fleet, including the limits from the SOTA, appear in Figure 7.
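As a reference for the model evaluation above, the MAPE and its interpretation bands can be sketched as follows. The thresholds (below 10%, 10% to 20%, 20% to 50%, above 50%) are the ones commonly attributed to [36]; they are stated here as an assumption, since the text only names the "highly accurate" and "good" categories, and the sample values are hypothetical.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100 * sum(errors) / len(errors)


def mape_category(value):
    """Interpretation band for a MAPE value (thresholds are assumed)."""
    if value < 10:
        return "highly accurate forecasting"
    if value < 20:
        return "good forecasting"
    if value < 50:
        return "reasonable forecasting"
    return "inaccurate forecasting"


# Hypothetical average fuel consumptions (L/100Km) on a small test set.
y_true = [8.0, 10.0, 12.5]
y_pred = [7.6, 10.5, 12.5]

score = mape(y_true, y_pred)
label = mape_category(score)
```

MAPE is undefined when a true value is zero; in this setting records with missing or very small fuel consumption are removed during data cleaning, so the division is safe.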
For the analyses, we have considered only the vehicles that have a median MAPE over the test set of "Good forecasting" or better (unless otherwise indicated).

FIGURE 7 Median feature impact per Category-Subcategory-Fleet and the corresponding limits from the literature [5]

In Figure 8 we see how the absolute relative impact for each subcategory is below the limits shown in Table 2. This is shown more clearly in Figure 7, where we see that for all the data sets and for all the feature subcategories the relative impact is below the reference values. Complementing this, in Table 2 we see which subcategory-data set combinations are within the limits from the literature. For 44 of the 77 combinations (excluding the Subcategories "Other" and "Rain", as mentioned before), the feature relevance is within the limits from the SOTA. The remaining 33 fall outside the limits because they are either lower than the minimum value or higher than the maximum one (the latter for Tyres and Altitude in D3, Lubrication in D4 and Tyres in D5). "Aggressive Driving", "Eco-Driving", "Trip Type" and "Road Roughness" are the Subcategories that are common to all data sets and also have an aggregated feature impact within the literature limits. Others, such as "Steering Assist Systems" and "Vehicle Extra Mass", are also fully within the limits, but they are features that are relevant only for some data sets. With Figure 8 we see the individual impact per vehicle and date, for all the data sets considered together. As the Figure shows, "Other Vehicle Auxiliaries", "Steering Assist Systems", "Aggressive Driving", "Eco-Driving", "Vehicle Extra Mass", "Lubrication", "Road Roughness", "Trip Type" and "Ambient Temperature" have a median value per vehicle-date that is within the limits from the SOTA.
For some Subcategories, such as "Steering Assist Systems", "Vehicle Extra Mass", "Driving Uphill", "Road Roughness", "Traffic Condition" and "Ambient Temperature", the upper whisker value from the boxplot is also within the SOTA limits. In general, all the median values are at least below the upper limits, with the exception of "Altitude" and "Tyres", which are overestimated by the model. We also see that even though the impact per Subcategory normally does not exceed the upper values reported, there are data points where the impact is above the thresholds from the literature. With that, H2 is validated for 73 of the 77 subcategory-data set combinations: they have an influence on the fuel consumption with a relative impact that is always below the maximum SOTA values and, in some cases, even within the limits.

FIGURE 8 Fuel impact per vehicle-date for each fuel factor subcategory.

For H3, we analyse the extra fuel explained through XAI with respect to the extra fuel indicated by the limits generated by the outlier detection method from the previous Section. The intuition behind it is that even though the XAI whitebox algorithm is not trained on all the potential causes that may have an impact on the fuel consumption, it is enough to explain at least that extra anomalous fuel. In Figure 9 we see the comparison between the relative extra fuel explained by the XAI method and the extra fuel shown by the outlier detection algorithm for each of the models within every fleet data set. We see that, in fact, in the majority of the cases the extra fuel explained is actually greater than the extra anomalous fuel detected. Finally, the column "% below catalog" shows the percentage of vehicle-dates that receive a recommendation that brings the average fuel consumption (L/100Km) below the catalog reference (with an offset of 1 L/100Km).
This metric should be minimized, because the target fuel should not be below the catalog reference (it is a value that is not physically reachable). It indicates data points that the model is not explaining properly (it is overestimating the potential fuel reduction). The best cases are D1, D4, D7 and D9, where this metric is 3.3%, 1.9%, 4.0% and 0% respectively. The previous analysis can be enhanced by checking the fuel reduction for each vehicle model and route type with respect to the catalog reference. In the case of D1, we explicitly had the vehicle makes (so it was not necessary to retrieve them from the VIN decoding process). Because of that, we obtained the exact catalog fuel consumption from [44] and [45]. The results appear in Figure 10. There, we only see 4 cases where the new average fuel is below the catalog reference.

FIGURE 10 Average fuel consumption (L/100Km) for the vehicle models from data set D1 in each route type (city, combined, highway), before and after applying the recommendations, and compared to the catalog reference.

With the previous analyses we checked that EBM is indeed suitable to both model and explain the relationship between fuel factors and fuel consumption, since the model metrics are good enough and the domain knowledge metrics are in general aligned with prior domain knowledge. This was needed for conducting an analysis of the impact of fuel factors in order to quantify how much extra fuel is spent due to actionable features. For this analysis we focus on driving behaviour features, since they are among the features that have the most impact. They are also actionable, because they are mainly associated with the drivers and can potentially be acted upon without changing other contextual factors, such as the planned routes.
Figure 11 shows the fuel consumption of each of the fleets over the four months considered, together with the extra fuel consumption from driving behaviour and the extra fuel consumption due to the remaining factors. Focusing on D1 (since it is the fleet with the most information and the one that provided the best results), we see that the relative impact of all the features is between 17% and 23%, and for driving behaviour only, it is between 12% and 15%. Taking as an example the month of February, we see that there are 14631 extra litres spent due to driving behaviour. Reducing them would have a positive impact both on the expenses of the fleet and on the environment. Since the vehicles from D1 are mostly diesel, using the conversion to CO2 from [3], where 2.67633 Kg of CO2 are emitted per litre of diesel spent, the extra CO2 emissions in one month due to driving behaviour are between 19930 and 39157 Kg (the upper bound corresponds to February: 14631 × 2.67633 ≈ 39157 Kg).

FIGURE 11 Monthly fuel consumption (L) for each fleet over the different months, along with the part of that fuel that corresponds to the extra fuel due to driving behaviour, together with the extra fuel from the remaining factors.

The main libraries used for the work done in this paper are the following:
• EBM [46]. We used the default parameters from the library for all the analyses.
• Hypothesis contrasts [47]
• XGBoost [48]
• LightGBM [49]
• ElasticNet [50]

First of all, our proposal studies the influence on fuel usage for petrol and diesel vehicles altogether. An independent analysis for petrol and diesel vehicles may yield different results. Also, we do not cover hybrid vehicles within our study. Though we focused on actionable features for analysing the impact on the fuel consumption through XAI, there are other features that could be elicited. Regarding the EBM algorithm, we only used the individual feature relevance of each variable for building the recommendations, without considering possible pairwise terms if they exist.
Finally, as we saw for the case of harsh turns in Figure 4, the relationship between an input feature and the output may not be monotonic. It would be interesting to analyse how the results differ when applying monotonic constraints.

We see several possible lines of work following this paper. First of all, the set of variables used could be enhanced. Even though we use up to 70 features per data set, not all the subcategories mentioned within the SOTA are covered. For instance, we do not use any feature that measures trailer towing or the use of roof racks. Also, the analyses could be complemented with other XAI feature relevance techniques for generalizing and comparing the results obtained. Finally, even though the domain knowledge is considered before the model training and explanation generation when eliciting the features, in the remaining steps (business rules for filtering the explanations, metric analysis and so on) it is applied post hoc. Applying all the knowledge before training the model could potentially yield better results.

In this paper, we used Explainable Artificial Intelligence (XAI) through Explainable Boosting Machine (EBM) to measure the potential impact that actionable factors may have on the fuel consumption of diesel and petrol vehicles. EBMs are whitebox Machine Learning (ML) models that combine good model performance with a good degree of interpretability. To achieve that, we worked with real-world industry data sets that represent different types of vehicles, from passenger cars to heavy-duty trucks. We have gathered data from telematics devices connected to the vehicles for more than one year. With this source data, we have elicited up to 70 potential fuel factors based on the literature in order to build a data set that can be used to both model and explain the relationship between those factors and fuel consumption.
Then, after training an EBM model on each one of those data sets, we have proposed an algorithm to generate the explanations and quantify the potential impact of those fuel factors, combining the explanations with business rules that help to align them with prior domain knowledge. After that, we have evaluated the quality of the model from two points of view. First, we have checked that the algorithm was able to properly model the relationship between input factors and fuel consumption by analysing its model performance. At this point, we have also compared the performance against other well-known ML models with high predictive power that do not directly provide explanations (blackbox algorithms). We saw that EBM achieves good results that are similar to those of the other models. Second, we have evaluated the explanations generated against prior domain knowledge from the State of the Art, in order to see if the explanations are meaningful and aligned with that knowledge. We have seen that in general they are aligned, with particularly good results for factors related to driving behaviour, operational mass and trip type. With that, we proceeded to quantify the potential extra fuel that those vehicle fleets consume across different months, both in general and specifically considering driving behaviour factors. For some of the vehicle fleets with more data, we saw potential impacts due to driving behaviour of around 15%, highlighting both the extra economic costs and the environmental impact that the fleet is having due to inefficient driving.

In this paper we followed the literature regarding factors that have an impact on the fuel consumption of a vehicle, and we used XAI in order to see if it is possible to explain and quantify that impact with these techniques. For that, we used the Explainable Boosting Machine (EBM) algorithm, a type of whitebox Machine Learning model that yields explanations in terms of feature relevance.
We trained the model with a set of up to 70 features in order to predict the fuel consumption of diesel and petrol vehicles, using several real-world industry data sets from very different types of fleets (passenger cars, trucks and others). Then, we generated explanations combining the information provided by the EBM algorithm and the feature taxonomies from the SOTA regarding the factors that affect fuel consumption.

This research was done within the context of the registered patent [51] for LUCA Fleet at Telefónica. We thank Pedro Antonio Alonso Baigorri, Federico Pérez Rosado, Raquel Crespo Crisenti and Daniel García Fernández for their collaboration.

Review of in use factors affecting the fuel consumption and CO2 emissions of passenger cars
Sustainable Development Goals
A review of vehicle fuel consumption models to evaluate eco-driving and eco-routing
Road vehicles - Diagnostic systems - Keyword Protocol 2000. International Organization for Standardization
Effects of Diesel Exhaust Fluid (DEF) Injection Configurations on Deposit Formation in the SCR System of a Diesel Engine. SAE Technical Paper
Nonlinear model predictive control of integrated diesel engine and selective catalytic reduction system for simultaneous fuel economy improvement and emissions reduction
Integrated diesel engine and selective catalytic reduction system active NOx control for fuel economy improvement
Analysis of heavy-duty diesel truck activity and fuel economy based on electronic control module data
Information Technology, Communication and Control, Environment, and Management (HNICEM)
Impact of Driver Behavior on Fuel Consumption: Classification, Evaluation and Prediction Using Machine Learning
Application of machine learning for fuel consumption modelling of trucks
Impact of driver behavior on fuel consumption: Classification, evaluation and prediction using machine learning
Evaluation of ecodriving performances and teaching method: Comparing training and simple advice
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
Understanding the origins and variability of the fuel consumption gap: Lessons learned from laboratory tests and a real-driving campaign
Environmental effects of driving style: impact on fuel consumption
Identifying main and interaction effects of risk factors to predict intensive care admission in patients hospitalized with COVID-19: a retrospective cohort study in Hong Kong
Interpretable prediction of goals in soccer
A Unified Framework for Machine Learning Interpretability
Interpretable machine learning. Lulu.com
Accurate intelligible models with pairwise interactions
Generalized additive models: some applications
Rule Extraction in Unsupervised Anomaly Detection for Model Explainability
Fuel consumption ratings; 2021. Data retrieved from Natural Resources Canada, Government of Canada
Fuel Economy Data; 2021. Data retrieved from United States Environmental Protection Agency
Vehicle Certification Agency, Car fuel and emissions information; 2021. Data retrieved from Vehicle Certification Agency
Green Vehicle Guide, Car fuel and emissions information; 2021. Data retrieved from Green Vehicle Guide
Regularization and variable selection via the elastic net
Xgboost: A scalable tree boosting system
Lightgbm: A highly efficient gradient boosting decision tree
Industrial and business forecasting methods: A practical guide to exponential smoothing and curve fitting
Mean absolute percentage error for regression models
A primer on partial least squares structural equation modeling (PLS-SEM)
Partial least squares structural equation modeling: Rigorous applications, better results and higher acceptance
Structural equation modeling analysis with small samples using partial least squares. Statistical strategies for small sample research
Resources for XAI Fuel Recommendations
Technologies and approaches to reducing the fuel consumption of medium- and heavy-duty vehicles
SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python
XGBoost: eXtreme Gradient Boosting
Scikit-learn: Machine Learning in Python
Método y Programas de Ordenador para Gestión de Flotas de Vehículos