Title: Adaptive Methods for Short-Term Electricity Load Forecasting During COVID-19 Lockdown in France
Authors: David Obst, Joseph de Vilmarest, Yannig Goude
Date: 2020-09-14
Note: The first two authors have contributed equally to this work. D. Obst (david.obst@edf.fr), J. de Vilmarest (joseph.de-vilmarest@edf.fr) and Y. Goude (yannig.goude@edf.fr) are with Électricité de France R&D.

Abstract: The coronavirus disease 2019 (COVID-19) pandemic has urged many governments around the world to enforce a strict lockdown, in which all nonessential businesses are closed and citizens are ordered to stay at home. One of the consequences of this policy is a significant change in electricity consumption patterns. Since load forecasting models rely on calendar or meteorological information and are trained on historical data, they fail to capture the significant break caused by the lockdown and have exhibited poor performance since the beginning of the pandemic. This makes the scheduling of electricity production challenging and has a high cost for both electricity producers and grid operators. In this paper we introduce adaptive generalized additive models using Kalman filters and fine-tuning to adjust to new electricity consumption patterns. Additionally, knowledge from the lockdown in Italy is transferred to anticipate the change of behavior in France. The proposed methods are applied to forecast the electricity demand during the French lockdown period, where they demonstrate their ability to significantly reduce prediction errors compared to traditional models. Finally, expert aggregation is used to leverage the specificities of each prediction and enhance results even further.

Accurate electricity load forecasting is of paramount importance for the balancing of the electricity grid, since load forecasts are the main inputs of production planning at different horizons [1] and storage capacities are still limited relative to consumption needs. Load forecasting is performed at different time horizons, ranging from intra-day (10 minutes to 24 hours ahead) to daily, weekly, monthly or even a few years in advance, for industrial needs covering production planning, demand response, grid management, electricity trading, risk management, optimization of production unit maintenance and commercialization. The field has been thoroughly studied over the past decades, especially by the time series, statistics and machine learning communities. Time series approaches are very efficient for very short-term forecasts (typically less than 24 hours ahead). They rely on autoregressive integrated moving-average (ARIMA) models [2] or functional approaches [3], [4] exploiting daily and weekly patterns in the electricity load data. Statistical and machine learning models are usually stronger for short and mid-term predictions (more than one day ahead). They use calendar characteristics (such as the time of year or day of the week) as well as meteorological effects (temperature, wind speed) or tariff options as inputs, and are then trained on a large set of historical data (usually 3 to 5 years). A good overview of load forecasting practices has been given by the Global Energy Forecasting Competition (GEFCOM) 2012 [5]. Popular algorithms include gradient boosting machines [6], neural networks [7], [8] or Generalized Additive Models (GAM) [9], [10], [11].
These semi-parametric models are very attractive to electric utilities as they combine the flexibility of fully nonparametric models with the simplicity of multiple regression models, and are computationally efficient enough to scale to big data [12]. The main French electricity provider, EDF (Électricité de France), uses GAM as its lead forecasting tool.

However, the coronavirus pandemic has significantly affected consumption patterns all over the world. As presented in [13], [14], the closure of nonessential businesses as well as the stay-at-home directives have led to a significant drop in power demand and to changes in daily consumption patterns. Figure 1 shows the French and Italian electricity load over time in 2020; the decrease due to the lockdown (which began earlier in Italy) is clearly visible. Daily profiles of the French consumption before and after the lockdown are represented in Figure 2. After the lockdown, the daily load shapes of both countries converged towards that of Saturdays.

Since models are trained on historical data and make the underlying assumption that future behavior will be similar to past behavior, they will fail to produce satisfactory predictions during the lockdown period. For instance, in France GAMs usually achieve around 1% MAPE (mean absolute percentage error) [9], but were around 5% during the first few weeks of the lockdown, thus necessitating expert intervention to correct the model forecasts. Not only do these poor forecasts have a high cost for electricity producers and system operators, but they also represent a threat to the proper functioning of the electrical network, which could have even more severe consequences than usual during a pandemic. This is why finding novel approaches to better predict the load demand during these troubled times is of paramount importance.

However, to our knowledge, only a few papers have addressed this problem to date. [15] is among the first to propose an efficient strategy to improve the predictions during the COVID-19 lockdown period in France. Using an adaptive functional state-space model and assimilating the period to non-workable days, the author was able to achieve significantly better performance than the French system operator. In [16] the integration of mobility data is combined with multi-task learning to improve forecasting during the lockdown. The authors show that mobility is indeed a relevant feature that should be integrated into load demand models, and that joint training of a neural network for multiple geographical areas yields additional benefits and compensates for the lack of data. However, none of these papers investigates how GAM could be improved during the lockdown period.

We consider here the framework of GAM and propose two new adaptive versions of these models. The idea of adaptive models is to take advantage of data observed in an online fashion to update an initial model. In every adaptive forecasting method a trade-off has to be found between good reactivity to a change (whether it is a smooth drift or a break) and good behavior during stable periods. One of the most popular algorithms for that purpose is the Kalman filter [17], already applied to electricity load forecasting in [18] and [19]. We propose here to couple Kalman filters with GAM to obtain a forecasting procedure which performs well before the lockdown, exploiting the nice properties of GAM, while also reacting quickly to the sudden change in the data at the beginning of the lockdown.
The second approach we present leverages ideas from transfer learning to fine-tune a GAM on the lockdown period. Transfer learning (also referred to as learning-to-learn or knowledge transfer) is a branch of machine learning that aims at reusing knowledge from a source task on a target one [20], [21]. It has shown great success, particularly when the source data is plentiful and the target data scarce. Recently it has even found applications in electricity load forecasting, to transfer information from one set of customers to another [22]. In our case the source data is the data before the lockdown, and the target data is the data during the lockdown in the country of interest (France in our study), or even in a similar country where the lockdown began earlier (e.g. Italy here).

The contributions of our work are the following:
1) Two mathematical approaches are proposed to efficiently adjust a historical model to consumer behavior change over time, even in the case where data is scarce. Furthermore they do not require the integration of additional features.
2) The two methodologies have been successfully applied to the difficult period of the COVID-19 lockdown in France, achieving forecast accuracy close to the one observed before the pandemic.
3) An empirical strategy is suggested to anticipate the impact of the lockdown on the load using another country's data, thus enabling satisfactory predictions from the very first day of the stay-at-home order.

The rest of the paper is organized as follows. In Section 2 we introduce the two model adaptation methods relying on Kalman filtering and fine-tuning. Section 3 presents the data and the GAM model used for the French load, and Section 4 summarizes the main results of our experiments. Finally, Section 5 concludes our study and suggests further work.

We consider additive models which assume that the response variable y_t decomposes as

$$y_t = \beta_0 + \sum_{j=1}^{d} f_j(x_{t,j}) + \varepsilon_t,$$

where (ε_t) is an independent identically distributed (i.i.d.) random noise and x_t = (x_{t,1}, ..., x_{t,d}) are the explanatory variables at time t. Each nonlinear effect f_j is decomposed on a spline basis (B_{j,k}) with coefficients β_j:

$$f_j(x) = \sum_{k=1}^{m_j} \beta_{j,k} B_{j,k}(x),$$

where m_j depends on the dimension of the spline basis. The coefficients β_0, β_1, ..., β_d are then estimated by penalized least-squares. The penalty term involves the second derivatives of the functions f_j, forcing the effects to be smooth (see [23]). The random residuals ε_t are assumed to be Gaussian i.i.d. in the first place. Later, in the numerical experiments, we introduce another variant of this model in which the residuals are assumed to follow an ARIMA model optimised with classical time series methods.

We focus here on structural adaptation of the GAM over time, and present two different levels of adaptation. First, we consider the reduced problem of adapting a linear combination of the frozen effects f_1, ..., f_d. Second, we adapt the whole model by fine-tuning.

In order to reduce the dimension of the adaptation problem, a strategy is to freeze the nonlinear effects and to correct them by a multiplicative factor. Precisely, we define

$$f(x_t) = \big(1, \bar{f}_1(x_{t,1}), \ldots, \bar{f}_d(x_{t,d})\big)^\top,$$

where $\bar{f}_j$ is a normalized version of f_j obtained by subtracting the mean on the train set and dividing by the standard deviation. Then we adaptively estimate θ_t such that

$$y_t \approx \theta_t^\top f(x_t).$$

Thus we reduce the number of coefficients from $1 + \sum_{j=1}^{d} m_j$ to $1 + d$. This is a good trade-off: the resulting model is simple enough to react quickly to a break in the data generation process, yet complex enough to capture the nonlinear properties of the load.
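As a concrete illustration, the following Python sketch builds the reduced feature vector f(x_t) from frozen effects. The effect functions, data and normalization constants are purely illustrative placeholders, not the fitted splines of the paper.

import numpy as np

def build_adaptation_features(effects, X, train_mean, train_std):
    """Stack frozen nonlinear effects into the reduced feature vector
    f(x_t) = (1, fbar_1(x_{t,1}), ..., fbar_d(x_{t,d})).

    effects    : list of d callables, the frozen effects f_j (placeholders here)
    X          : (n, d) array of explanatory variables
    train_mean : (d,) means of f_j(x_{.,j}) on the training set
    train_std  : (d,) standard deviations of f_j(x_{.,j}) on the training set
    """
    n, d = X.shape
    F = np.ones((n, d + 1))                      # first column is the intercept
    for j, f_j in enumerate(effects):
        F[:, j + 1] = (f_j(X[:, j]) - train_mean[j]) / train_std[j]
    return F

# Toy usage with illustrative effects (not the paper's fitted splines)
effects = [np.sin, np.square]
X = np.random.default_rng(0).normal(size=(100, 2))
mean = np.array([effects[j](X[:, j]).mean() for j in range(2)])
std = np.array([effects[j](X[:, j]).std() for j in range(2)])
F = build_adaptation_features(effects, X, mean, std)   # shape (100, 3): 1 + d coefficients to adapt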
1) Exponential Least-Squares: An empirical method consists in solving at each step a least-squares problem in which the weights decrease exponentially with the time difference. Precisely, we define

$$\hat{\theta}_t = \arg\min_{\theta} \sum_{s=1}^{t-1} \mu^{t-s} \big(y_s - \theta^\top f(x_s)\big)^2,$$

and we predict $\hat{y}_t = \hat{\theta}_t^\top f(x_t)$. This formulation involves a single parameter, the exponential forgetting factor μ. The advantage of this type of adaptation lies in its simplicity. The forgetting factor μ is determined by minimizing the RMSE on a validation set composed of the last year of the train set, for a GAM trained on the beginning of the train set; we then keep the same μ for the GAM trained on the whole train set. Previous work has been done on estimating this parameter online, but it leads to computational issues and potential instability of the model (see [24]).

2) Kalman Filter: We also present a state-space model approach. We assume the following equations:

$$y_t = \theta_t^\top f(x_t) + \varepsilon_t, \qquad \theta_{t+1} = \theta_t + \eta_t,$$

where (ε_t) and (η_t) are Gaussian white noises of respective variance σ² and covariance matrix Q. This is the setting of Kalman filtering [17]; we therefore use the Kalman recursive formulae providing the expectation and covariance of the state θ_t given the past observations, and these estimators yield the mean and variance of y_t given the past. This is described in Algorithm 1 (Kalman Filter Recursion: at each time step t = 1, 2, ..., a prediction step followed by an estimation step).

There is a wide literature concerning the setting of the hyper-parameters θ̂_1, P_1, σ², Q, on which the Kalman filter crucially relies; see for instance [25], [26], [27]. We observe that the iterates of θ̂_t depend only on θ̂_1, P*_1 = P_1/σ² and Q* = Q/σ², reducing the set of hyper-parameters as in [25]. An interesting degenerate covariance matrix is the static setting Q* = 0 (the state equation becomes θ_{t+1} = θ_t). Defining θ̂_1 = 0 and P*_1 = I, the estimate θ̂_t is a regularized empirical risk minimizer:

$$\hat{\theta}_t = \arg\min_{\theta} \Big\{ \|\theta\|^2 + \sum_{s=1}^{t-1} \big(y_s - \theta^\top f(x_s)\big)^2 \Big\}.$$

In order to obtain a dynamic setting we maximize the likelihood on the training set. The Expectation-Maximization algorithm is a renowned algorithm to find a local optimum. However, the lack of a global guarantee makes it inefficient in our case, and we chose instead to apply a form of grid search. Precisely, we decided to set P*_1 = I as in the static setting, and for a given Q* the optimal θ̂_1 for the likelihood has a closed-form solution. Q* is of dimension 10 × 10 and we chose to restrict ourselves to diagonal matrices whose coefficients are in the set {2^j, −30 ≤ j ≤ 0}. This is still a set of around 8 · 10^14 elements, so we used an iterative greedy procedure: we start from Q*^(0) = 0 and at each step, having Q*^(k) in hand, we compute the likelihood of each matrix in which only one coefficient differs from Q*^(k), and we define Q*^(k+1) as the one maximizing the likelihood among those tested. This algorithm required fewer than 10^4 evaluations of the likelihood.
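The Kalman recursion itself can be sketched in a few lines of Python. This is a minimal illustration of the state-space model above, working directly with P*_1 = P_1/σ² and Q* = Q/σ² (so σ² cancels from the state estimate), on toy data rather than the paper's load features.

import numpy as np

def kalman_forecast(F, y, theta1, P1_star, Q_star):
    """Kalman recursion for y_t = theta_t' f(x_t) + eps_t, theta_{t+1} = theta_t + eta_t,
    parameterised by P*_1 = P_1/sigma^2 and Q* = Q/sigma^2.

    F : (n, d) rows f(x_t);  y : (n,) observations.
    Returns one-step-ahead predictions and the successive state estimates.
    """
    n, d = F.shape
    theta, P = theta1.copy(), P1_star.copy()
    preds, thetas = np.empty(n), np.empty((n, d))
    for t in range(n):
        f = F[t]
        preds[t] = f @ theta                     # prediction step
        thetas[t] = theta
        S = f @ P @ f + 1.0                      # innovation variance, in units of sigma^2
        K = P @ f / S                            # Kalman gain
        theta = theta + K * (y[t] - f @ theta)   # estimation (update) step
        P = P - np.outer(K, f @ P) + Q_star      # covariance update, then add state noise
    return preds, thetas

# Static setting: Q* = 0, theta_1 = 0, P*_1 = I, on synthetic data
rng = np.random.default_rng(1)
F = rng.normal(size=(200, 3))
y = F @ np.array([1.0, -0.5, 2.0]) + 0.1 * rng.normal(size=200)
preds, thetas = kalman_forecast(F, y, np.zeros(3), np.eye(3), np.zeros((3, 3)))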
In order to take the lockdown into account in the state-space representation, it is natural to consider a time-varying state noise covariance Q_t. Indeed, we expect the model to change much faster during and after the lockdown than before. This motivates a dynamic estimation of Q_t; however, due to the amplitude of the crisis, we modelled a break in the data at the beginning of the lockdown. We chose to change only the state noise covariance at the break time T, and for t ≠ T we use Q*_t = 0 in the static setting or Q*_t = Q* in the dynamic setting. We do not want to put any a priori on the break, and the break covariance Q*_T is defined accordingly.

In the previous methods the nonlinear effects f_j(·) were frozen and adjusted with a multiplicative factor. However, this may be insufficient for certain new types of behavior. Since learning a new model from scratch is inadvisable considering the few samples of target data available, we would like to start from the previously trained model and adapt it on the few instances available. This is a particular case of the framework of transfer learning, more specifically of model fine-tuning (FT). It consists in reusing a part of the parameters learned on the source set (typically neural network layers) and adjusting them, for instance with a few gradient iterations on the target set. Model fine-tuning has been successful in different fields such as computer vision [28] or even time series forecasting [29]. In our case we fine-tune the parameters of our GAM. Since the GAM boils down to a penalized linear regression problem, fine-tuning it amounts to fine-tuning a linear model. This framework was elaborated in [30]. Let B(x_s) be the vector of the B_{j,k}(x_{s,j}) and let B(X_t) denote the matrix made of the concatenation (by row) of the B(x_s) for s = 1, ..., t−1. Starting from the coefficients β̂_S learned on the source data, at each time step we perform K iterations of batch gradient descent with fixed step size α on the squared loss over the target data observed so far, yielding an adjusted parameter vector β̂_t. As discussed in the aforementioned paper, the choice of the step size α is not crucial, as long as it is small enough. In practice a good step size is α = α*/5, where α* = 2 / (λ_max(B(X_t)^⊤ B(X_t)) + λ_min(B(X_t)^⊤ B(X_t))) and λ_max(M) and λ_min(M) respectively denote the maximum and minimum eigenvalues of M. The major hyperparameter to tune is therefore K, the number of gradient iterations to perform. Theoretical methods are currently being investigated in the aforementioned paper and have been used to guide our choice here, but it was also observed empirically that for K between 50 and 100 the results are often good. Therefore a number of iterations in that range is always considered, and this choice usually coincides with the suggested theoretical guidelines.
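A minimal Python sketch of this fine-tuning step follows. It assumes a squared loss of the form (1/2)·||Y_t − B(X_t)β||², which is consistent with the step-size rule quoted above but is an assumption on our part, and it uses a random toy design matrix in place of the paper's spline features.

import numpy as np

def fine_tune(beta_source, B, y, K=75):
    """K batch gradient steps from the source coefficients on the target data.

    beta_source : (p,) coefficients learned on the source period
    B           : (t-1, p) design matrix of spline evaluations on the target period
    y           : (t-1,) target loads
    Assumed loss: 0.5 * ||y - B beta||^2, with alpha = alpha*/5 as in the text.
    """
    G = B.T @ B
    eig = np.linalg.eigvalsh(G)                   # ascending eigenvalues of B'B
    alpha = (2.0 / (eig[-1] + eig[0])) / 5.0      # alpha*/5
    beta = beta_source.copy()
    for _ in range(K):
        grad = B.T @ (B @ beta - y)               # gradient of 0.5*||y - B beta||^2
        beta -= alpha * grad
    return beta

# Toy usage with a hypothetical design matrix and a shifted target behaviour
rng = np.random.default_rng(2)
B = rng.normal(size=(30, 5))
beta_src = rng.normal(size=5)
y = B @ (beta_src + 0.3) + 0.05 * rng.normal(size=30)
beta_t = fine_tune(beta_src, B, y, K=75)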
In this section we detail the GAM model that has been used to forecast the French electricity consumption, as well as the data to which it has been applied. The French electricity consumption is freely available on the website of the system operator RTE (Réseau de Transport d'Électricité, https://opendata.rte-france.com). Our dataset ranges from the 1st of January 2012 to the 7th of June 2020 with a 30-minute temporal resolution. As explanatory variables we obtained nationally averaged temperatures from the website of the French weather forecaster Météo-France (https://donneespubliques.meteofrance.fr/). We took observed temperatures instead of forecasts in order to use only open data and make the results reproducible. As our goal is to compare different forecasting strategies on the same data, this choice is relevant and allows a more precise comparison, since we do not include in the score the uncertainty due to physical meteorological forecasts. We train the models on historical data from the beginning of 2012 to the end of August 2019.

In this paper we are interested in predicting the load during and after the COVID-19 lockdown period in France. Since consumer behavior changed abruptly during the first month and stabilized during the second one, we divide the crisis test data into two periods. The first one ranges from March 16th to April 15th and the second one from April 16th to June 7th. Note that although the lockdown officially began on Tuesday the 17th of March 2020 at midday in France, we consider March 16th as the first day of our lockdown period, as the behavior had already changed. Finally, in order to assess the suitability of the offline methods and of the ones that do not model the break, we consider the pre-lockdown period between September 1st 2019 and March 15th 2020.

The time of day is crucial for load forecasting. It does not appear in the following definition of the additive model because we build one model for each instant of the day, i.e. we treat the 48 half-hourly time series independently. The model combines effects of the following variables, where at each time t:
• y_t is the electricity load for the considered instant,
• DayType_t is a categorical variable indicating the type of the day of the week,
• DLS_t is a binary variable indicating whether t is in summer hour or winter hour,
• ToY_t is the time of year, whose value grows linearly from 0 on the 1st of January at 00h00 to 1 on the 31st of December at 23h30,
• Temp_t is the temperature,
• Temp95_t and Temp99_t are exponentially smoothed temperatures with smoothing factors 0.95 and 0.99 (see the recursion sketched below),
• TempMin99_t and TempMax99_t are exponentially smoothed variables (factor 0.99) of the minimal and maximal temperatures of the day,
• Load1D and Load1W are the load of the previous day and the load of the previous week.

The models are trained in R using the library mgcv [31]. As previously mentioned in Section II, we suppose that ε_t is a Gaussian noise with zero mean and constant variance. However, this hypothesis is rarely true in practice and we observe an auto-correlation structure in the errors. We thus propose to model it with an ARIMA model, selecting the best model with the AIC criterion [32] within the family of ARIMA(p,d,q) with p, q ≤ 100 and d ≤ 1 (we use the R function auto.arima of R. Hyndman). In that case the forecasts are obtained by adding the GAM forecasts and the short-term correction of the ARIMA model, which exploits recent observations.
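The exponentially smoothed temperature features listed above can be computed with a simple recursion. The following Python sketch assumes the usual form s_t = α·s_{t−1} + (1 − α)·x_t with the series initialised at its first value; this is a common convention and not necessarily the exact choice made in the paper.

import numpy as np

def exp_smooth(x, alpha):
    """Exponentially smoothed series s_t = alpha * s_{t-1} + (1 - alpha) * x_t,
    an assumed construction for the Temp95 / Temp99 features (smoothing
    factors 0.95 and 0.99); initialisation at the first observation.
    """
    s = np.empty_like(x, dtype=float)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * s[t - 1] + (1 - alpha) * x[t]
    return s

temp = 15 + 5 * np.sin(np.linspace(0, 20, 1000))   # toy temperature series
temp95, temp99 = exp_smooth(temp, 0.95), exp_smooth(temp, 0.99)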
Italy was the first country in Europe to be massively affected by the novel coronavirus. The Italian government decreed a total lockdown from the 9th of March 2020, hence 7 days before the French one. It also seems reasonable to assume that countries respond to similar stay-at-home orders in similar ways, which Figure 1 supports. Hence our idea is to use this one-week head start and to adjust our GAM model for France according to the changes observed in Italy. We have at our disposal data from the Italian system operator Terna and meteorological data gathered through the R package riem, available from the 1st of January 2015 to the 28th of June 2020 with a 1-hour temporal resolution. For each instant, a model similar to (1) is constructed on the data over the range 2015-2019, with the main differences being that the effects f_3(·) and f_6(·) are removed, and that f_2(·) is replaced by a sum of 7 effects, one for each day of the week. Then the same procedure as described in Section II-B is applied.

Let δ̂_t denote the adjustment of the estimated coefficients obtained by performing the aforementioned fine-tuning procedure on the Italian data ranging from the 28th of February up to date t (typically t could correspond to the 15th of March, the day before the stay-at-home order began in France). We then use

$$\tilde{\beta}_t = \hat{\beta}_S^{FR} + \rho\, \hat{\delta}_t$$

to perform the predictions for France, where β̂_S^{FR} is the French source parameter vector and ρ is a scale parameter accounting for the difference in load levels between France and Italy. We refer to this model as GAM-δ. Since the ToY effect is modelled differently in the Italian model (one function per day of the week), we do not adjust the corresponding coefficients in the French model. This is further justified by the fact that, in general, the ToY effect is very specific to a country and should be learned on at least a whole year. As for the choice of ρ, assuming that the consumptions in France and Italy are proportional with factor ρ allows us to use the simple estimate $\hat{\rho} = \sum_t y_t^{FR} / \sum_t y_t^{IT}$, with the sums taken over a year for instance. The advantage of GAM-δ is that it can be applied to reduce the prediction error from the very first day of lockdown. One can afterwards combine this procedure with fine-tuning on the French data as it becomes available. The procedures for both regular fine-tuning and GAM-δ are summarized in Algorithm 2 (Transfer learning at time step t; inputs: step size α, number of iterations K, French and Italian source parameters β̂_S^{FR}, β̂_S^{IT}, scale parameter ρ; cases: GAM fine-tuned, GAM-δ, and GAM-δ fine-tuned, where for GAM-δ the fine-tuning is performed on the Italian data and β̃_t = β̂_S^{FR} + ρ δ̂_t).
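As an illustration of the GAM-δ adjustment, a small Python sketch follows. The coefficient vectors and yearly load series are hypothetical placeholders, and in practice the coefficients tied to the ToY effect would be left untouched, as explained above.

import numpy as np

def gam_delta_coefficients(beta_fr_source, delta_it, y_fr_year, y_it_year):
    """GAM-delta adjustment sketch: scale the coefficient shift learned on the
    Italian lockdown data by the ratio of national load levels and add it to
    the French source coefficients (ToY-related coefficients would be excluded
    in practice, as discussed above).
    """
    rho = y_fr_year.sum() / y_it_year.sum()          # rho_hat = sum_t y^FR_t / sum_t y^IT_t
    return beta_fr_source + rho * delta_it           # beta_tilde_t = beta^FR_S + rho * delta_t

# Toy usage with hypothetical coefficient vectors and yearly load levels
rng = np.random.default_rng(3)
beta_fr = rng.normal(size=8)
delta_it = rng.normal(scale=0.1, size=8)
y_fr = rng.uniform(40_000, 90_000, size=17_520)      # half-hourly French load over a year (MW)
y_it = rng.uniform(20_000, 50_000, size=8_760)       # hourly Italian load over a year (MW)
beta_tilde = gam_delta_coefficients(beta_fr, delta_it, y_fr, y_it)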
The presented adaptation methods are now applied to the French electricity load forecasting problem. While accuracy metrics are of paramount importance, we also focus on the interpretation of our results and on model behavior. The moving averages of the errors of the different models are represented in Figure 3. At the beginning of the lockdown all the models tend to overpredict the load. However, most of our adaptive methods quickly accommodate to the lower demand and progressively reduce their bias, notably Kalman with dynamic break and the fine-tuned GAM. On the contrary, the regular GAM does not succeed in reducing the error (even with the help of an ARIMA) as it keeps overpredicting the demand. GAM-δ on the other hand is very good during the first couple of days, efficiently taking advantage of the change in patterns observed in Italy. However, it quickly drifts away over time because the Italian consumption recovers faster than the French one during the second month of lockdown (see Fig. 1). Nevertheless, since the objective of GAM-δ is to provide an initial boost of performance during the first couple of weeks while the other models adjust, this is only a minor issue (see Section IV-B).

We test the Kalman filter in a static and a dynamic setting as described in Section II-A2. For both we assess the introduction of a break at lockdown. The evolution of the state estimate θ̂_t is displayed in Figure 4 for different settings. In the static setting the Kalman filter optimizes a state which is assumed to be constant, which explains its slow evolution compared to the faster changes of the dynamic setting. Moreover, the model changes faster during lockdown in both settings. As expected, the introduction of a break covariance matrix at the beginning of the lockdown allows the model to adapt much faster.

The model dynamics can be analysed for the fine-tuning too. The only coefficients of δ̂_t with a significant evolution after fine-tuning are the ones pertaining to the lagged load (γ for Load1W and β_i, i = 1, ..., 7 for Load1D); they are represented in Figure 5. The other ones are zero and have been omitted for clarity. The coefficients of the working days drop, especially Monday's, whereas the ones of the weekend increase, notably Saturday's. This can be interpreted as follows: the historical model learned a certain transition between the different days of the week. With the lockdown, all the days are now similar and close to a Saturday, which has a lower demand than Monday, and thus the associated coefficient plummets. The coefficient of Saturday soars because the demand on Fridays is now much lower than it used to be and the daily profiles are similar. Finally, since the electricity demand progressively decreases during the first weeks (see Fig. 1), the coefficient γ drops as well.

We proposed two load forecasting models (ARIMA, GAM) and different variants to adapt them to the lockdown period (exp-LS, Kalman adaptation, transfer learning), leading to 11 candidates. A natural approach is then to aggregate them into a single forecast which benefits from the best one as a function of time. This is the main idea behind online aggregation methods, which have already demonstrated their benefits in the field of electricity load forecasting (see [33], [34]). Since Figure 2 shows the convergence of the daily profiles towards the Saturday shape, this observation as well as [15] motivates adding another expert, named GAM Saturday, whose prediction is made by the regular GAM as if every day were a Saturday.

We briefly recall the main principles of the online aggregation approach and refer the interested reader to [35] for a complete presentation. A bounded sequence of observations (here the half-hourly total consumption of customers) y_1, ..., y_n ∈ [0, B] is observed (B being an unknown constant). We have access to a set of N experts who produce forecasts of the sequence at each instant t based on past values of y. The aggregation is then computed step by step:

$$\hat{y}_t = \sum_{j=1}^{N} \hat{p}_{j,t}\, \hat{y}_t^j,$$

where the weights are updated according to the past performance of each expert. To compute the weights we use the ML-Poly algorithm of [36], implemented in the R package OPERA [37]. To summarize the procedure, the algorithm puts more weight on an expert which improved the performance of the aggregation in the past, using a gradient-descent-like strategy with a vectorial time-varying step (also called the learning rate) η_{k,t} that depends only on the past performances of the experts, so that no parameter tuning is needed.

Finally, a few experts are introduced in the aggregation only at lockdown. Indeed, the transfer learning experts do not make sense before (there is no target data), the Kalman experts modelling the break coincide with the other Kalman experts before lockdown, and the expert treating every day as a Saturday was only introduced for the lockdown period. These specialized experts are added to the aggregation at the lockdown period with a uniform weight (1/12), and the experts present before share the rest of the weight proportionally to their previous weights [38].
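The aggregation step and the weight-sharing rule for newly introduced experts can be illustrated with a short Python sketch. The actual weight updates in the paper are those of ML-Poly via the OPERA package; the numbers below are purely illustrative.

import numpy as np

def introduce_experts(old_weights, n_new, n_total):
    """Weight-sharing rule sketched above: each newly introduced expert receives
    the uniform weight 1/n_total, and the experts already present share the
    remaining mass proportionally to their previous weights.
    """
    new_w = np.full(n_new, 1.0 / n_total)
    remaining = 1.0 - new_w.sum()
    old_w = remaining * old_weights / old_weights.sum()
    return np.concatenate([old_w, new_w])

def aggregate(weights, expert_preds):
    """Convex aggregation y_hat_t = sum_j p_hat_{j,t} * y_hat^j_t."""
    return weights @ expert_preds

# Toy usage: 7 pre-lockdown experts, 5 specialized experts added at lockdown (12 in total)
old = np.array([0.3, 0.2, 0.15, 0.15, 0.1, 0.05, 0.05])
w = introduce_experts(old, n_new=5, n_total=12)
preds = np.random.default_rng(4).uniform(55_000, 65_000, size=12)   # experts' forecasts (MW)
y_hat = aggregate(w, preds)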
The evolution of the weights of the experts over time is given in Figure 6. It gives insight into which predictions are the most useful in the aggregation at a given time. The lockdown acts as a break and causes a significant shift in the weight distribution. As such, GAM Saturday immediately takes a large weight: this is due to the aforementioned resemblance of the daily profiles during the lockdown to those of Saturdays. Moreover, this expert predicts a lower consumption than reality, compensating for the overestimation of the other experts at the beginning of the lockdown. GAM-δ also has high importance, as it has knowledge of what happened in Italy and thus suits the new patterns of load demand in France. For instance, on the first two days of lockdown (16th and 17th of March) GAM-δ yields an RMSE of 1984 MW, compared to 2674 MW and 3005 MW for Kalman with dynamic break and the regular GAM respectively. However, their importance dwindles with time as the adaptive Kalman and fine-tuning methods have seen enough data and become more competitive.

As usual in electricity load forecasting, the performance metrics are the root mean squared error (in MW) and the mean absolute percentage error (in %):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\big(y_t - \hat{y}_t\big)^2}, \qquad \mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\frac{|y_t - \hat{y}_t|}{y_t},$$

where n is the number of instances in the test set. We display the numerical performance of our methods in Table I. The benefit brought by any of our methods is clear, with RMSE and MAPE significantly lower than those of a standard GAM+ARIMA on both COVID-19 test sets. The Kalman with dynamic break yields the best results for the two error metrics on both test sets, but the fine-tuned methods are very close to it. The additional benefit brought by expert aggregation is emphasized by the last two rows. The algorithm manages to take advantage of the individual specificities of the different predictions, leading to further error reduction on both test periods. It is interesting to note that although it is individually poor (see Table I), the inclusion of GAM Saturday in the mixture is of paramount importance for the first testing period. This is because it compensates for the bias of the other experts (they tend to overestimate the demand whereas GAM Saturday underestimates it).

V. CONCLUSION

In this paper we proposed two novel approaches to adaptive generalized additive models, one relying on Kalman filtering and the other on transfer learning with GAM fine-tuning. The Kalman philosophy consists in reacting quickly to a change in the data and updating the forecasts by taking advantage of recent observations. Transfer learning allows information to be shared from other data sets with similar or complementary properties. The methods have been applied to real French electricity consumption data from the COVID-19 lockdown period. We show the benefit of the transfer approach to anticipate the lockdown effect using Italian data, and demonstrate the ability of adaptive methods to significantly improve predictions compared to benchmark models, without relying on the inclusion of new exogenous features. Moreover, expert aggregation made it possible to take advantage of the individual experts' specificities and enhanced the results even further.

While in this paper we focused on adapting GAM, the proposed framework can be applied to other approaches. The use of neural networks for instance, with their high performance in the field of load forecasting, will soon be investigated. We also plan to include other exogenous information such as the mobility data proposed in [16] or macro-economic indicators. Regarding load data, we believe that exploiting regional data could be pertinent, as the propagation of the pandemic and its impact on consumption differed depending on the region in France and Italy. We would also like to include more countries.
For these next steps, transfer approaches will obviously be of fundamental importance, but so will adaptation questions, as the effect of these exogenous variables will probably vary with time.

REFERENCES
[1] Comparative models for electrical load forecasting.
[2] Short-term load forecasting via ARMA model identification including non-Gaussian process considerations.
[3] A prediction interval for a function-valued forecast model: application to load forecasting.
[4] Modeling and forecasting daily electricity load curves: a hybrid approach.
[5] Global Energy Forecasting Competition 2012.
[6] GEFCom2012 hierarchical load forecasting: gradient boosting machines and Gaussian processes.
[7] Electric load forecasting using an artificial neural network.
[8] Deep neural network based demand side short term load forecasting.
[9] Short-term electricity load forecasting with generalized additive models.
[10] Local short and middle term electricity load forecasting with semi-parametric additive models.
[11] Fast calibrated additive quantile regression.
[12] Generalized additive models for large data sets.
[13] Changes in electricity demand pattern in Europe due to COVID-19 shutdowns.
[14] Year-on-year change in weekly electricity demand, weather corrected, in selected countries.
[15] France short-term load demand forecasting using a functional state space adaptative model: case of COVID-19 lockdown period.
[16] Using mobility for electrical load forecasting during the COVID-19 pandemic.
[17] A new approach to linear filtering and prediction problems.
[18] Forecasting hourly electricity demand using time-varying splines.
[19] Dynamic factors in state-space models for hourly electricity load signal decomposition and forecasting.
[20] A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering.
[21] A survey of transfer learning.
[22] Forecasting customers' response to incentives during peak periods: a transfer learning approach.
[23] Generalized additive models: an introduction with R.
[24] Adaptive learning of smoothing functions: application to electricity load forecasting.
[25] Introduction to time series and forecasting.
[26] Time series analysis by state space methods.
[27] Multivariate statistical modelling based on generalized linear models.
[28] Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning.
[29] Reconstruction and regression loss for time-series transfer learning.
[30] Transfer learning for linear regression: a statistical test of gain.
[31] Package mgcv.
[32] Time series analysis and control through parametric models.
[33] Forecasting electricity consumption by aggregating experts; how to design a good set of experts. In: Modeling and Stochastic Learning for Forecasting in High Dimensions.
[34] Aggregation of multi-scale experts for bottom-up load forecasting.
[35] Prediction, Learning, and Games.
[36] A second-order bound with excess losses.
[37] opera: Online prediction by expert aggregation.
[38] Forecasting electricity consumption by aggregating specialized experts.